
    USING MULTIPLE CHOICE QUESTION TESTS

    IN THE SCHOOL OF GEOGRAPHY

    A Report by the MCQ Working Party

    [Jim Hogg (Chairman), Alan Grainger, and Joe Holden]

    TABLE OF CONTENTS

1. INTRODUCTION

2. TOWARDS A NEW FRAMEWORK FOR LEARNING ASSESSMENT

2.1 Catalysts for Change

2.2 Principles of a New Framework for Learning Assessment

2.3 Implementing the New Framework

2.4 How to Test for Different Learning Outcomes Using MCQs

2.5 Interactions Between Module Design and Assessment Design

2.6 Recommendations to the School Learning and Teaching Committee

3. A CODE OF PRACTICE FOR USING MULTIPLE CHOICE QUESTION TESTS

3.1 General Aims

3.2 Pedagogical Aims

3.3 A Framework for the Use of MCQ Tests in Geography

3.4 Obligations to Students Prior to the Test

3.5 Design of Test Question Papers

3.6 Design of Questions

3.7 Procedures for Tests

3.8 Processing of MCQ Cards

3.9 Feedback to Students

3.10 Evaluation and Moderation

3.11 Software for Assessing MCQs

3.12 Provision of Sample MCQs for Each Module

3.13 Tapping Into Existing Banks of MCQs for Ideas

3.14 Sharing Best Practice in the Use of MCQs in Geography

3.15 Recommendations to the School Learning and Teaching Committee

    1. INTRODUCTION

    Over the last six months the School has been rethinking its whole approach to teaching, in order to

    rationalize programmes within a framework that meets universal benchmarks, and in anticipation of

a continuing rise in student numbers. The changes agreed so far will be accompanied by modifications to the forms and timing of formative and summative assessment with which we


    monitor the development of student capabilities. Significant changes in student learning are also

    inevitable, though we have not yet discussed these in detail, and this too will affect our assessment

    methods. If we are concerned enough about rising student numbers to overhaul our approach to

    teaching, then we must surely give careful consideration to the use of Multiple Choice Question

    (MCQ) tools that can cut the time needed to mark tests of large classes and provide assessment that

    can be both targeted and broad in coverage.

    However, there is far more to the debate about when and how to use MCQ tools. For until now we

    have not paid much attention to identifying which forms of assessment are appropriate to measuring

    the learning outcomes of particular modules and levels. One reason for this is that we have not

    specified the learning outcomes very precisely either. If we do not do this how can we possibly

    know which are the most appropriate forms of assessment to employ? It does not make sense to

    continue along a path determined only by tradition and organizational inertia. It is just as wrong to

    choose traditional assessment tools, such as essay exams, by default as it is to choose MCQ

    assessment merely to save time. We seem to have become accustomed to the fact that most students

    at Level 1 will inevitably achieve low grades, instead of considering whether it is not the students

that are at fault but our assessment systems. What we should be aiming at is a more rational approach to the design of learning and assessment, albeit one that allows for behavioural factors too.

    This should lead to the introduction of an appropriate assessment regime for each level and module,

    which involves the use of an integrated set of assessment tools that are best suited to monitoring all

anticipated learning outcomes. MCQ tools have an important role to play in such a regime. But it is

not only their use which needs to be justified - the same applies to all forms of assessment for every

    module.

    The report has two main parts. Part 1 proposes a comprehensive Framework for Learning

    Assessment in general and identifies the place of MCQ tests within it. Part 2 is a Code of

    Practice for designing and implementing MCQ tests within this framework.

    2. TOWARDS A NEW FRAMEWORK FOR LEARNING ASSESSMENT

    As professional educators, we all try when delivering our modules to inculcate in our students a

    knowledge and understanding of specific aspects of our chosen subject and to promote their

    acquisition of generic academic and transferable skills. Assessment is essential to monitor the

    development of these two main types of capabilities, and Multiple Choice Question (MCQ) tests

    have an important role to play in this. Unfortunately, there is still widespread misunderstanding

    about what they can and cannot offer. This is often linked to a tendency to favour traditional

    assessment methods, such as essay exams, and a lack of appreciation of what is to be assessed and

    the best way to do this. This part of the report argues that instead of making decisions on the use of

    MCQs in relation to a subjective choice of essay exams as a gold standard assessment tool, we

    should adopt a comprehensive assessment framework that can be applied to all modules, and for

    each module choose the set of assessment tools that is most appropriate within this framework.

    2.1 Catalysts for Change

    2.1.1 Moves Toward More Precise and Objective Assessment

It is difficult to overstate the tremendous changes that have taken place in the last ten years in the required standards for module descriptions. We are now required to specify in advance the


    objectives of the modules that we offer, and the particular academic and transferable skills which

    students will acquire. Yet this has not yet been matched by a requirement to make our assessment

    methods more precise, so we can monitor the extent to which these capabilities are actually

    acquired. This situation cannot last for ever, so we would be wise to anticipate future demands for

    more precise assessment.

    2.1.2 A New Focus on Progression

    Another feature of the learning process that will become increasingly important is an ability to

    define specific trends in progression in student learning. This has three main parts.

    1. Ensuring that a sequence of modules in a given subject strand shows a clear progression

    in sophistication from Level 1 to Level 3. The school has already taken steps in this direction by

    clearly differentiating between our expectations of what students should achieve in each year:

'exploration and inspiration' in Level 1, 'analysis and criticism' in Level 2, and 'creativity and

    scholarship' in Level 3. We now need to choose assessment tools to match these expectations.

    2. Monitoring the progression of individual students in acquiring an increasingly

    sophisticated set of academic and transferable skills in moving from Level 1 to Level 3. This is an

    entirely understandable requirement in its own right, but it assumes added political significance in

    the light of government pressures for widening access to higher education and increasing the value

    added to student performance. Suitable assessment tools will be needed for this purpose as well.

    3. Monitoring progression within a module, as a student gains more confidence in

    acquiring and applying the skills, concepts and tools that are taught in it. Using formative

    assessment tools while a module is under way can provide valuable feedback to both students and

staff. The results of objective formative tests can be a valuable guide for evaluating the reliability of summative assessments, such as essay examinations, which are marked more subjectively.

    2.1.3 Monitoring Long Term Trends in Teaching and Learning Quality

    The School may also come under pressure from the University or external bodies to adopt a more

    sophisticated approach in monitoring trends in the quality of its overall performance in teaching

    and learning. The current focus is on ensuring that we have adequate systems and processes for

    quality control and appropriate information records to support this. Yet these requirements, however

    onerous, cover only a small fraction of the factors that determine the quality of our performance. In

principle, trends in quality should be apparent in trends in mean or median marks or in the grade distributions of classes of graduating students. Yet marking regimes have changed over time in

    response to our own changing expectations (e.g. previously various references to the literature were

    seen as essential for the award of a 2.i grade) and to outside advice on how our standards compare

    with those of other institutions. We surely need more objective measures that are unaffected by

    subjective policy changes and provide more reliable indicators of long-term trends in the quality of

    our teaching. Objective MCQ test results have a key role to play in charting these long-term trends.

    2.1.4 Past Responses to the Challenges Posed by Large Classes

Several of us have been forced over the years by the dramatic growth in class sizes to change our preconceived notions about MCQs, and recognize that they are a highly efficient and pragmatic way


to assess learning in large classes of 200 to 300 students or more, for which traditional essay

    question exams are impractical. A number of approaches have been adopted:

    1. Some of us have not taken this decision after a rational choice of the most appropriate

    means of assessment, but have simply identified the method that can be used to examine a large

    class and produce marks as rapidly as possible, within the short time allowed before submission

    deadlines.

    2. Others have looked at the matter in a broader context and decided that written exams are

    simply too elaborate a mechanism to apply at Level 1, when it is only necessary to divide a class of

    students into Pass and Fail categories. Both this group of colleagues and the first one may have

    opted to assess 100% of the module mark by an MCQ Exam at Level 1.

    3. Another group has taken the view that MCQ Exams or Tests are best used as part of an

    integrated assessment approach and so have complemented them with written tests, presentations

    etc. which assess other capabilities and account for a share of the total module mark.

    2.1.5 Adapting Assessment to the New Learning Environment

    Our discussions over the last few months have been long and complicated. Inevitably, we have had

    to give priority to reaching agreement on how to change our teaching. Relatively little attention has

    been paid to discussing how these innovations in teaching will impact on the student learning

    process. Yet cutting by half the number of lectures used to deliver a similar amount of material as

    before will inevitably require not only more independent learning, but also changes in the character

    of the learning process to ensure that the quality of learning is maintained, if not increased. Two

    innovations in learning, in particular, will have consequences for assessment:

    1. We can expect to see a greater use of informal active learning groups to provide

    mutual support and stimulation for individual learning and to partly substitute for reductions in

    group activity in formal workshops. Module convenors could assess informal group work by

    including individual reports/term papers based on the outputs of group discussions.

    2. More emphasis will be placed on providing online support for individual learning, and

    this may need to be accompanied by online formative assessment. MCQs have proven utility in this

    regard and the School must decide whether to use such formative assessment simply to provide

    informal feedback to students or whether it will contribute to the final module mark. Technologies

    and practices are well established in this area, but if we choose the second option then this raises a

    number of issues, including the mechanism of mark delivery to School databases and how to avoid

    plagiarism, which will need to be explored by a further investigation by the MCQ Working Party.

    2.2 Principles of a New Framework for Learning Assessment

    2.2.1 Integrated and Appropriate Assessment

    At the core of a new framework for learning assessment is the need to select an integrated set of

    appropriate assessment tools. Appropriate assessment involves using the assessment tool best

    suited to the particular student capability that we wish to measure. For any given module it is likely

    that we will need to measure a number of different capabilities, and so we shall need a number of

appropriate assessment tools, each of which measures one or more of these capabilities. An integrated


assessment regime combines the appropriate assessment tools needed to measure the complete set

    of capabilities required, so the strengths of one tool complement the strengths of the others.

    2.2.2 A Rational Choice of Assessment Tools

    In principle, every module convenor should identify the student capabilities they are trying to

    enhance. Consequently, they should be able to justify in a rational way the set of tools needed to

    assess these capabilities. Over-assessment has featured regularly in our discussions over the last few

    months, and is a consequence of not rationally choosing appropriate assessment tools. This can

    occur in two ways:

    1. We may assess too many individual pieces of work.

    2. We may use an inappropriate assessment tool.

    a. The most obvious instance of this is to set an essay exam to test for the

    acquisition, understanding and retention of knowledge, when essays are best suited to

    assessing more advanced capabilities. This can have a detrimental effect on the overall

    grade distribution, because the marks we award for essays are inevitably influenced just as

much by how students structure their basic analysis and argument as they are by their display of

knowledge. Since the former quality is under-developed in Level 1 students it is no wonder

    that modules at Level 1 which rely heavily on essay exams achieve lower marks than those

    which use other assessment tools. For example, in 2000-01, the proportions of classes

    gaining 2.i grades in two Level 1 modules assessed entirely by MCQ exams were 25% and

15%. In contrast, only 1% of the class taking a module assessed entirely by essay exam gained a 2.i.

    b. Another instance is when the composition of a single assessment tool is

    inappropriate. For example, when a Short Answer Question at Level 1 is graded according

    to analytical ability as well as the recall of knowledge. Alternatively, an MCQ test may

    contain too many questions which rely on the recall of specific facts, compared with other

    types of questions which assess other learning outcomes. A good example of this is

    provided by a Level 1 MCQ exam in which the first half had only 25% of all questions

testing factual recall, while in the second half 50% of all questions fell into this category. The mean mark for the first half was 58.4 while that for the second half was 41.8.

    2.2.3 Allowing for Behavioural Factors

    While a rational choice of assessment tools should be our first priority it is important to recognize

    the value of behavioural elements in the learning and teaching process.

    1. Students will rarely undertake or put much effort into an assignment unless they receive

    marks for it that contribute towards their final module mark. This in itself is one of the causes of

    over-assessment, as it can lead to too many pieces of assessed work per module.


    2. There is more to learning than sitting in lectures, absorbing the information, and

    undergoing tests to see how well this has been achieved. Many other things are needed to make the

    learning process enjoyable and fulfilling, and this can make some redundancy in assessment

    unavoidable.

    3. Students try various strategies to optimize the effort they have to exert to obtain a

    desirable level of marks, e.g. by learning answers to anticipated exam questions in relatively few

    areas of module content and then adapting them to fit the questions that are actually set. Essay

    exams are seldom designed to combat this by testing breadth of knowledge across the whole of

    module content. This problem can be partially overcome if a knowledge of two different areas of a

    module is needed to answer any one essay exam question, though it is not a perfect solution.

    4. Students often do not plan their studies effectively. So they do not gain a proper

    understanding of the basic concepts and theories presented in a module while teaching is taking

    place. They delay a lot of learning until their revision period and so are ill-prepared for the final

    exam. They then rely more on regurgitation of lecture material than on the analysis of concepts,

evaluation of theories etc. In this respect formative MCQ tests during a module can motivate students to learn basic concepts and theories earlier. As a result, they are better prepared for the final

    summative examination which tests the full range of their capabilities and not simply their

    knowledge. Short answer question tests can also be used for this purpose, but marking them is more

    expensive and time-consuming.

    5. Feedback from formative tests within a module can also provide a powerful stimulus to

    students to enhance their learning to improve their grades. Good results can encourage students to do

even better or, for some, to slacken off, but poor grades will motivate students to work harder. Ignorance of one's

    true level of performance, on the other hand, can only perpetuate learning activities that ensure that

    students fail to realize their potential.

    2.2.4 Controlling Diversity in Assessment When Choosing Integrated Sets of Tools

    If we employ an ideal set of multiple assessment tools we should not expect the same mark

    distributions as we get from using just one tool and one piece of work. Instead, mark distributions

    may tend to be more bunched. There are three reasons for this:

1. Since every student is different and no one is perfect, we should expect most students to do

    better in one form of assessment than they do in others. So their best marks will be offset by poorer

    ones.

    2. Low marks in formative assessment conducted early in a module will give

    students an incentive to work harder to improve their performance in later assessments.

    3. High marks in formative assessment conducted early in a module may give

    other students the excuse to take it easy in later assessments so they only do just

enough to pass.

    There is no magical solution to this problem. An ideal set of assessment tools will measure

    individual performance in identified learning outcomes and no more. But a greater diversity


    of tools and a greater number of assessed pieces of work may lead to more bunching of

    marks than is desirable.
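
A brief statistical note may help to explain why this bunching occurs (a sketch only, assuming roughly independent component marks of similar spread and equal weighting, which will rarely hold exactly): if a module mark is the mean of $k$ assessed components, each with standard deviation $\sigma$, then the standard deviation of the module mark is approximately

$$\sigma_{\text{module}} \approx \frac{\sigma}{\sqrt{k}},$$

so the more pieces of work that are averaged, the narrower the resulting mark distribution becomes, quite apart from the behavioural effects described above.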

    Every module convenor should therefore be responsible for choosing the judicious balance

    of assessment which they regard as best for their module, within limits set by the School

    Learning and Teaching Committee (SLTC). This should be based on a rational appraisal of

    the appropriate assessment tools needed to monitor learning outcomes, tempered by a

    proper allowance for behavioural factors, and an awareness of the need to control diversity

    in assessment. The key point to emphasize is that we are proposing that identifying an

integrated set of appropriate assessment tools should be seen as essential for all modules,

    whether or not MCQ assessment is used.

2.2.5 Bloom's Taxonomy of Learning Outcomes

Bloom's Taxonomy of Learning Outcomes provides a simple conceptual framework for identifying the appropriate assessment tools needed for particular modules and for

    particular levels of a degree course. It portrays the development of intellectual skills in

    terms of a progression from the lowest stage of Knowledge Recall (Stage 6 below) to the

    highest stage of Evaluation (Stage 1). The definitions below are based on those by Bloom.

    1. Evaluation: Judges, criticizes, compares, justifies, concludes, discriminates, supports.

    Requires the ability to judge the value of given material for a specific purpose and with respect to

    criteria stated or devised by the student.

    2. Synthesis: Combines, creates, formulates, designs, composes, constructs, rearranges,

    revises. Requires the ability to assemble various elements to create a new whole, in the form of a

    structured analysis of a question, a plan (such as a research proposal), or a taxonomy.

    3. Analysis: Differentiates between parts, diagrams, estimates, separates, infers, orders,

    sub-divides. Requires the ability to break down material into parts and understand how they are

    organized.

    4. Application: Demonstrates, computes, solves, modifies, arranges, operates, relates.

    Requires the ability to use learned rules, methods, concepts, laws, theories etc. in new and concrete

    situations.

    5. Comprehension: Classifies, explains, summarizes, converts, predicts, distinguishes

    between. Requires the ability to understand and/or interpret the meaning of textual or graphical

information.

    6. Knowledge Recall: Identifies, names, defines, describes, lists, matches, selects, outlines.

    Requires the recall of previously learned material.

    2.2.6 A Progression of Desired Learning Outcomes During Each Degree Programme


    The priority learning outcomes for every module should ideally be framed within the general

    progression of learning outcomes throughout our degree programmes. Our discussions over the last

    few months have provided a basis for identifying these. For example,

At Level 1, 'The Year of Exploration and Inspiration': Knowledge Recall, Comprehension and

    Application.

At Level 2, 'The Year of Analysis and Criticism': Knowledge Recall, Comprehension, Application,

    Analysis (and Evaluation).

At Level 3, 'The Year of Creativity and Scholarship': Knowledge Recall, Comprehension,

    Application, Analysis, Synthesis and Evaluation.

    2.2.7 Matching Assessment Tools to the Progression of Learning Outcomes

    It therefore makes sense to devise a general assessment policy for each level that matches the above

    progression in learning outcomes throughout each degree programme. This will involve:

    1. Identifying the assessment tools which are suitable for monitoring desired learning

    outcomes.

    2. Choosing typical proportions of final module marks which can be provided by different

    assessment tools at particular levels.

    3. Specifying appropriate biases in the design of different assessment tools used

    at particular levels to match the dominant learning outcomes at those levels.

    At the moment such decisions are made more by intuition and tradition than rational

    design. It is likely that much of what we are doing now could prove to be justified if

    subjected to detailed analysis. On the other hand, if some of our assessment is redundant

    and/or inappropriate, as also seems likely, then we may be spending more time on

    assessment than we should be, and receiving less information on student performance

    than we could get by employing a more appropriate system.

    2.2.8 Promoting Progression of Skills When Designing the Progression of Assessment

    If we do change the balance between assessment tools employed early in a degree programme then

    we must ensure that students have adequate skills to cope with more demanding assessment tools at

    higher levels. This is particularly true in the case of essays. Fortunately, we do give intensive

    personal tuition on essay writing skills in tutorials at Levels 1 and 2. So if we were to stop holding

    essay exams at Level 1 then this should not have a detrimental effect on student performance at

Levels 2 and 3. The area which is perhaps most in need of attention in this regard is the over-reliance on

    Short Answer Questions in some Level 2 modules and then expecting students to make a smooth

    transition to achieving high marks in essay examinations in Level 3 modules.

    2.3 Implementing the New Framework


    2.3.1 General Guidelines for Using Particular Assessment Tools at Each Level

    1. At Level 1, for which desired learning outcomes are on the lower stages of the Bloom

    Taxonomy, appropriate assessment tools will be those that predominantly measure performance in

    knowledge recall, comprehension and application, supplemented by tools that assess other

    transferable skills, such as presentations and posters. Consequently, marks derived from MCQ, Short

    Answer and Practical Class assessment could account for over a half of the total module marks, and

    such tools can be used for both formative and summative assessment.

    2. At Levels 2 and 3, when higher stage learning outcomes are more important, MCQ tools

    should be restricted to formative assessment and provide less than 50% of the total module mark.

    Logically, Short Answer Questions used in summative assessment should also be capped in this

    way.

3. MCQ tests at all three levels should contain a mixture of different question styles that

    test learning outcomes at various stages of the Bloom Taxonomy. However, in proceeding from

Level 1 to Level 3, there should be an increase in the proportion of the questions that test learning outcomes at middle or high stages of the taxonomy.

4. Short Answer Questions, which are also best suited to assessing knowledge and basic

    explanatory capabilities, can also be used for summative assessment at Level 1. At higher levels they

    should be restricted to a set maximum percentage of the total module mark, whether they are used

    for formative or summative assessment.

5. Essay Questions, which are best suited to assessing higher learning outcomes, should

    account for the majority of the total module marks at Levels 2 and 3. There is a strong argument for

not using them for summative assessment at Level 1, because the potential for students to reveal their knowledge, comprehension and application capabilities is constrained by their poorly

    developed skills in structuring arguments, and this will limit mean module marks. A more rational

    approach - albeit a radical one - would be to limit the use of Essay Questions at Level 1 to formative

assessment and only allow them to account for less than half of the total module mark. This should

    help to increase marks in Level 1 modules. However, proper safeguards should be introduced, as

    suggested above, to ensure a proper progression in essay-writing skills in proceeding from Level 1 to

    Level 3.

    2.3.2 Guidelines for Matching Assessment Tools to Learning Outcomes and Other Goals

    1. Identify Priority Learning Outcomes

    Every module convenor should identify the learning outcomes which are the priority for

    assessment in a given module. Under the new teaching system all desired learning

    outcomes should already be listed in the module outline. However, they are still rather

    general and could benefit from further refinement. Then they need to be ranked in order of

    priority.

    2. Match Tools to Assess Priority Learning Outcomes


    An integrated set of appropriate assessment tools should then be selected to monitor priority

    learning outcomes. Each tool has a different capacity to assess learning outcomes at different stages

    of the taxonomy (Table 1). Generally, essays are best suited to assessing higher stage outcomes,

while MCQs and Short Answer Questions are more suited to low to middle stage outcomes - though

    MCQs can assess skills in evaluation and, to a lesser extent, synthesis. Bias against MCQs often

    results from ignorance of their potential, and an inability to see that all tools have advantages and

    disadvantages.

    Table 1. The Suitability of Three Tools for Assessing Different Learning Outcomes

(v = suitable, x = unsuitable, x/v = partially suitable)

Measures:             MCQs   Short Answers   Essays
1. Evaluation          v           x            v
2. Synthesis          x/v         x/v           v
3. Analysis            v           v            v
4. Application         v           v            v
5. Comprehension       v           v            v
6. Knowledge           v           v            x

    3. Justify the Choice of Assessment Tools With Reference to Key Goals

    In an integrated set of tools, therefore, the merits of one tool are complemented by those of others.

    MCQs offer totally objective assessment, in contrast to the subjectivity involved in marking essay

    questions. Short answer questions are of intermediate objectivity (Table 2). While MCQs are prone

    to blind guessing, essay questions are vulnerable to bluffing, which in practice makes it difficult to

    award a fail mark to answers of a reasonable length. MCQs and essay questions complement each

other in meeting other goals. MCQs can sample student knowledge and capabilities across a wide range of module material, while essays are best suited to assessing depth of knowledge. MCQs can

    identify learning errors, but essays are excellent for assessing original thinking (Table 3).

    Table 2. The Objectivity of Three Assessment Tools

                                MCQs   Short Answers   Essays
1. Scoring is objective          v         x/v            x
2. Eliminates bluffing           v          x             x
3. Eliminates blind guessing     x          v             v

    Table 3. The Merits of Three Assessment Tools in Meeting Other Goals

                                  MCQs   Short Answers   Essays
1. Samples broadly                 v         x/v            x
2. Tests for depth of knowledge    x         x/v            v
3. Pinpoints learning errors       v          v             x
4. Assesses originality            x          x             v


    4. Refine the Choice of Tools to Assess Other Student Attributes

    Tests completed by individual students under examination conditions cannot assess the ability of

    students in other areas, such as team work, presentation skills and graphical skills. The School

    already employs other forms of assessment to monitor these, such as individual presentations, group

    presentations, and posters. However, these have disadvantages as sole assessment tools, e.g. group

    presentations are inadequate for assessing individual attainment, and the ranking of outcomes higher

    than Knowledge Recall is confused by varying skill in presentation techniques. Posters are good at

    assessing knowledge and graphical skills but not higher stage outcomes. As examination essay

    questions are not perfect for allowing full individual creativity to blossom, this has given rise to the

use of term papers completed in a student's own time. But even these may not in practice reveal the

    true extent of student creative and synthetic capabilities, owing to behavioural reasons, e.g. students

    allocate insufficient time to their preparation. It is also difficult to eliminate the effects of

    collaboration.

    5. Further Refine the Choice of Tools to Fit Within Time Constraints

    The ideal set of tools may not be appropriate in view of time constraints, such as those imposed by

    the need to assess very large classes or complete marking within a very short time period. MCQs do

    take longer to devise than short answer and essay questions, but they can be used again in later years

    and they take much less time to mark (Table 4). It is possible to save time by using multiple

    demonstrators to mark formative Short Answer Question tests, yet even if the most detailed crib is

    supplied considerable checking by the course convenor is required to prevent inconsistency between

    different markers.

    Table 4. Ease of Use of Three Assessment Tools Within Time Constraints

                      MCQs   Short Answers   Essays
1. Easy to devise      x          v             v
2. Easy to repeat      v          x             x
3. Easy to mark        v          x             x

    6. Further Refine the Choice and Timing of Tools to Allow for Behavioural Factors

    It is often necessary to assess more than can be rationally justified in order to allow for behavioural

    factors. Some of these were listed above. In addition, group work may need to be given a high

    priority because it increases enjoyment of a module and catalyses the individual learning process.

    2.3.3 Selecting an Integrated Set of Appropriate Assessment Tools

    MCQ assessment is therefore best viewed as just one of a group of assessment options, each with its

    own set of advantages and disadvantages and suitability for assessing different stages of learning


    outcomes. An ideal integrated set of appropriate assessment tools will complement the merits and

    demerits of one tool by those of others, fit within time constraints and allow for behavioural factors.

    MCQ assessment has the following advantages:

    a. It can be set at a range of degrees of difficulty and cognitive levels on the Bloom

    Taxonomy.

    b. It can test for breadth of knowledge across module content.

    c. It provides no opportunities for students to gain marks for material which is not on the

    question set or is only loosely related to it - as happens often in essay questions.

    d. It serves an important behavioural function by encouraging students to remember,

    interpret and use the ideas of others. After taking MCQ tests students should be better prepared for

    essay exams which test their ability to organize and integrate concepts and express their own ideas.

    The objectivity and breadth of MCQ assessment complements the subjectivity and depth of essay

    exams. While essay questions are often treated as the 'gold standard', we tend to forget that the

    reason why we need extensive moderation of essay exams is that they are highly prone to

    subjectivity in marking, however much time examiners devote to trying to be as objective as

    possible.

    As long as we recognize the inherent criticisms and limitations of MCQ assessment we can use it

    selectively and efficiently in place of, and to complement, essays. We have seen in the last few

    months that the School is giving a high priority to reducing the time allocated to teaching and

    administration, in order to free more time for research. Similar opportunities are also available to

    reduce time spent in assessment. Some colleagues have already taken advantage of this option by

    using MCQ methods. There is scope for the wider use of MCQs in the School, but this should be

    part of a holistic approach to assessment that does not focus purely on saving time.

    2.4 How to Test for Different Learning Outcomes Using MCQs

There are still widely held misconceptions about MCQ assessment. For example, that MCQ

tests are merely pub quizzes that test knowledge of specific facts, or that MCQ tests only

assess what students don't know, not what they do know. Such statements are rather

ironic given that some colleagues still award substantial marks for Level 3 essay exam

questions that merely involve the recall of factual information. To prevent further

misunderstanding, this section includes examples of how MCQ tools can assess capabilities at all six stages of learning outcomes in the Bloom Taxonomy.

    6. Knowledge Recall

    MCQ tests can be used to assess whether a student can recall specific pieces of

    information learnt during the course. This is not necessarily a bad thing, because students

    cannot demonstrate their ability to satisfy higher stage learning outcomes if they do not

    have a good knowledge and understanding of the basic concepts and theories which are

needed for this. It should also be remembered that students actually like to learn pieces of information, and in Level 1 the expansion of student knowledge is an important desired


    learning outcome. So a judicious proportion of knowledge-based questions is justified in a

    Level 1 MCQ assessment.

    a. Identify - choose the correct term, concept, theory, person, organization etc.

    b. Define - choose the correct definition or correctly complete a definition.

c. Match - choose the correct sequence of items in List B to match the list of items in List A.

    5. Comprehension

    However, even Level 1 MCQ tests should include questions that do not depend on the memory of

    specific facts, but instead assess the general understanding of what has been learned during the

    module. Comprehension-style questions are highly suitable in this regard.

    a. Distinguish between - read a passage and identify which of a list of concepts are/are not

    represented there as examples.

    b. Distinguish between/explain - choose the correct explanation of a given problem or

    phenomenon from a number of alternatives.

    c. Summarize - interpret a given chart or map and choose the correct summary

    evaluation of its meaning.

    d. Convert - using a given map and other information supplied, choose which one

    of a list of alternative maps will result from adding this information to the existing spatial

    information on the map.

    4. Application

    Another type of question that does not require knowledge of specific facts is the application

    question, in which students need only know a specific formula and procedure so they can

    apply it to a practical example. In most modules formulae and procedures are explained in

    detail, often with practical examples in the lecture or class, so they should be firmly

    embedded in the minds of students.

    a. Computes - given appropriate statistical or numerical information, apply a

particular method to calculate the correct value of a given term.

b. Relates - relate previously acquired knowledge to a practical example and select the

    correct technical term for the phenomenon described.

    3. Analysis

    Higher stage learning outcomes also build on the basic knowledge and competences of students to

    perform sophisticated analysis of theoretical or practical problems.

    a. Separate - break down a passage into its component parts and select which of a list of

    terms correctly describes a given combination.


    b. Differentiate - read a passage which erroneously describes the basic principles

    of a given theory and identify which one of a list of words must be changed to ensure that

    the passage offers a correct description.

    c. Order - use 4-6 different key statistics for three different countries to identify which

    countries are referred to, by ranking them according to previously learned non-specific knowledge.

    2. Synthesis

    Although MCQ questions are not perfectly suited to this stage they can in principle be used

    to test ability in choosing the best sequence of a set of items.

    a. Combine/design - choose which one of a list of activities should represent the

    first stage of a research project. Repeat this for five subsequent questions, each dealing

    with subsequent stages.

b. Rearrange - read a list of 7 bullet points which constitute the main ideas conveyed in an essay, and decide which of the following adjustments of the order of ideas would ensure greater

    coherency.

    1. Evaluation

    It is also possible to use MCQ tools to assess the highest of all learning outcomes, evaluation.

    a. Judges - read a specimen answer to a given question and evaluate its merits according

    to stated criteria, e.g. regarding the level of accuracy of content and the ordering of stated terms.

    2.5 Interactions Between Module Design and Assessment Design

    Just as the choice of appropriate assessment tools must take account of desired module learning

    outcomes, so the selection of module content may also need to take account of the types of

    assessment to be employed. This is self-evident when, for example, a lecture is needed to brief

    students on the topic chosen for a subsequent workshop, so it should not be unexpected that similar

    interactions may be needed at other times. Human geographers, for example, may be less keen than

    physical geographers to use MCQ assessment, believing the tool to be less suited to discursive

    lectures than to lectures that convey many formulae. But it is not difficult to assess any social

    science module effectively with MCQ tools, as long as the lectures are rich in concepts and theories.

The obstacle to good MCQ assessment of social science lectures is not their qualitative approach, but simply the low density of concepts and prevalence of factual information. So if it is simply not

    practical to continue examining a large class by essay exams, it may be necessary to increase the

    density of concepts and theories communicated in lectures in order to make them easier to examine

    by MCQ tools.

2.6 Recommendations to the School Learning and Teaching Committee

    2.6.1. The Working Party was requested to provide a Code of Practice for using MCQ

    assessment tools, but we concluded that it was impossible to consider the use of MCQ assessment in

    isolation from other tools. There is scope for the wider use of MCQs in the School, but it should

    be part of a more holistic approach to assessment that does not focus purely on saving time.


    2.6.2. The School should look more carefully than hitherto at all the forms of assessment

    which are used for every module. All assessment should be justified rationally, while allowing for

    behavioural factors. It is just as wrong to choose traditional assessment tools, such as essay exams,

    by default as it is to choose MCQ assessment merely to save time.

2.6.3. The School should identify general learning outcomes for each level of a degree

programme, and priority learning outcomes for each module should then be identified within this

framework and linked to specific assessment tools.

    2.6.4. Module convenors should be asked to identify an integrated set of assessment tools

    which are appropriate to measuring the stated learning outcomes of their modules. MCQ assessment

    is just one of a group of assessment options, each with its own set of advantages and disadvantages

    and suitability for assessing different stages of learning outcomes. In an ideal integrated set of

    appropriate assessment tools the merits and demerits of one tool are complemented and enhanced by

    those of others, while assessment is designed to fit within time constraints and allow for behavioural

    factors.

    2.6.5. Our impending switch to a new teaching system is a suitable moment to re-evaluate our

    assessment regime. The School should consider how to adapt its present regime to the needs of the

    new learning environment associated with the new teaching system, in particular, the likely switch to

    more informal group activity and online support for independent learning.

    2.6.6. We would also be wise to anticipate future external demands for the use of more precise

    assessment tools. Objective MCQ test results have a key role to play in charting long-term trends

for an external audience in the quality of the School's performance in teaching and learning.

    2.6.7. Marks derived from MCQ assessment tools can account for over half of the total mark

    of Level 1 modules, for which desired learning outcomes are on the lower stages of the Bloom

    Taxonomy, and such tools can be used for both formative and summative assessment. At Levels 2

    and 3 MCQ tools should be restricted to formative assessment and provide less than 50% of the total

    module mark.

    2.6.8. MCQ tests at all three levels should contain a mixture of different question styles that

    test learning outcomes at various stages of the Bloom Taxonomy. In proceeding from Level 1 to

    Level 3 there should be an increase in the proportion of the questions that test learning outcomes at

    middle or high stages of the taxonomy.

    2.6.9. We recommend seeking comments on these proposals from members of SLTC, external

    examiners, learning and teaching experts within the university, and experts in other universities,

    before this report is formally adopted. The aim should be to ensure that assessment frameworks in

the School, and the use of MCQ tools within these frameworks, conform with norms within this

    University and in other universities, in Geography and in other disciplines.

3. A CODE OF PRACTICE FOR USING MULTIPLE CHOICE QUESTION TESTS


    Three main concerns led the School Learning and Teaching Committee to originally establish this

Working Party to devise a Code of Practice for using MCQ tools in module assessment. First,

    that the results obtained from MCQ tests were somehow inferior to those obtained from written tests

    and exams. We hope that the first part of this report will have quelled these concerns by showing the

way the various assessment tools complement one another within the framework provided by the Bloom Taxonomy.

    Second, that questions in MCQ tests were not set in a rigorous way. Third, that methods used for

    automated marking of MCQ cards were flawed. In the second part of our report we focus on the last

    two of these concerns and show that such problems can be considerably reduced, if not entirely

    eliminated, if tests are designed and implemented in proper ways.

    3.1 General Aims

    3.1.1 The School should continue to improve and refine its existing methods for assessing

    student performance and to search for innovative methods that will add to these.

    3.1.2 Our assessment procedures should be seen by students as being rational, fair and

unbiased, and open to scrutiny if necessary.

    3.1.3 We should make effective use of MCQ and other automated methods for student

assessment when this is deemed appropriate, effective and efficient, but we must maintain diversity

    in our methods of assessing students by different criteria.

    3.2 Pedagogical Aims

    3.2.1 A comprehensive, appropriate and integrated system of assessing students is important

    for satisfying the broad pedagogical objectives of the School.

    3.2.2 The MCQ test should be seen as part of a comprehensive, appropriate and integrated

    system of assessing students.

    3.2.3 To provide a framework for defining the role of MCQ tests within such a

    comprehensive system of assessment, the School Learning and Teaching Committee (SLTC) should

    define pedagogical aims for each level of study. Our proposals for such a Framework are outlined in

    Part 1.

    3.2.4 A basis for this framework could be Bloom's Taxonomy, which attempts to divide

cognitive objectives into subdivisions ranging from the simplest behaviour to the most complex (see Part 1).

    3.2.5 Since we would expect a gradual progression in knowledge and skills of students from

    Levels 1 to 3, the assessment tools used for each year, and the type and complexity of questions,

    should reflect this progression.

    3.2.6 Progression in cognitive objectives should be reflected in the types of assessment that

    we use:

    Level 1 Fundamental principles and knowledge. Elementary quantitative techniques.


    Level 2 Development and evaluation of skills and engagement with the literature.

    Intermediate analysis and evaluation of knowledge. Intermediate quantitative techniques.

    Level 3 Critical synthesis, and advanced analysis and evaluation of knowledge to achieve given

    objectives. Advanced quantitative techniques.

3.2.7 As our contribution to this exercise we would suggest that:

    a. MCQs have the primary functions of assessing: (i) acquired knowledge from lectures,

    set texts and other stated resources; (ii) evaluation of theoretical principles, and the ability to use

    them in practical applications and analysis; (iii) understanding of and ability to apply quantitative

    techniques.

    b. Essays have the primary function of assessing knowledge, critical synthesis, analysis

    and evaluation in a structured and well-argued written form.

    c. Short answer questions are an effective way to assess acquired knowledge and writing

    skills.

    3.2.8 Within this framework, we consider it to be entirely justified to use MCQ tests for

    summative assessment at Level 1, and for formative assessment at Levels 2 and 3. It would seem

    entirely logical for SLTC to devise similar guidelines for the use of other assessment tools.

    3.3 A Framework for the Use of MCQ Tests in Geography

3.3.1 Any given module in Geography should be able to use a number of different methods for assessing students, as deemed appropriate to the learning outcomes of the module (see Part

    1). This may include the use of MCQ tests, provided this is consistent with Item 3.2.8.

    3.3.2 The goals, objectives, methods, times and percentage mark for assessing each module

    should be stated explicitly in the module outline.

    3.3.3 MCQ tests can be used in various ways in a module, e.g. as a number of regular short

    (15 minute) tests or as 1-2 longer (45 minute) tests. If the latter option is chosen then the likelihood

    of a poorer overall performance in the first test can be reduced by the prior use of short practice

    tests.

3.4 Obligations to Students Prior to the Test

    3.4.1 The course handout for each module should include the proportions of the total mark

    provided by each form of assessment, e.g. essay exam, MCQ exam, course essays, MCQ tests,

    workshops, presentations etc.

    3.4.2 At the start of each module, students should be given a hand-out and a verbal

    explanation of procedures for completing an MCQ test.

3.4.3 This handout should include examples of the types of MCQ used in each test, to show students what they can expect.


    3.4.4 Published MCQs should include answers. If appropriate they may include explanations

    of why each option is correct or wrong.

    3.4.5 It is also good practice to prepare students for MCQ tests by holding very short (e.g. 5

    minute) snap tests in the middle of lectures. Students can be given the answers and allowed to mark

    their own question sheets. The tests are informal and marks are not collected.

    3.5 Design of Test Question Papers

    3.5.1 Each MCQ should conform to a high standard for clarity of expression, design, layout

    and marking, but academics should be free to explore innovative questions as in any other

    examination.

3.5.2 Questions should be checked carefully for errors, omissions, ambiguities and other

problems in both the question stem and the five answer options (one correct answer and four distractors). The use of five

options is the universal standard, and we recommend it be adopted unless exceptional circumstances prevent this.

    3.5.3 Module convenors are encouraged to request feedback from colleagues on the design

    and composition of tests. Ideally, checking for potential pitfalls is best done by several people rather

    than just one person.

    3.5.4 The start of each MCQ Question Paper should include written instructions on how to

    complete the computer card satisfactorily (see Section 3.7).

3.5.5 Each MCQ Question Paper should include instructions as to the type of marking used, e.g. negative marking, no negative marking etc., and preferably also information on the proportion

    that the test contributes to the module assessment as a whole. The School should formulate a policy

    on negative marking, after receiving expert guidance on the link between this and mark distributions.

    3.5.6 Each MCQ Question Paper should be printed on brightly coloured paper. This will

    ensure that any other unauthorised paper on desktops is easily recognized. It will also help to prevent

    question papers being removed from the room by students.

    3.6 Design of Questions

    3.6.1 Question stems should be phrased as clearly as possible, leaving no room for ambiguity.

3.6.2 The options for each question should all appear equally plausible to someone with a

poor knowledge of module content, but only one should be unambiguously correct to those

who have a good knowledge of module content. To reduce the possibility of students finding the

correct answer by guesswork, ensure that all the options are internally consistent and of

similar length, and that the position of the correct answer (i.e. a, b, c, etc.) varies.


    3.6.3 There must be only one correct answer to a question with five options. If one option is

correct, then a student guessing blindly still has a 20% chance of selecting the correct answer.
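
A short worked example may make the arithmetic explicit (an illustration only: the quarter-mark penalty below is a hypothetical figure, not an agreed School policy - see Item 3.5.5 on negative marking). With five options per question, blind guessing succeeds with probability 1/5, so on an n-question test guessing alone yields an expected mark of about 20%. If a scheme deducted 1/4 of a mark for each wrong answer, the expected gain per question from blind guessing would be

$$E[\text{mark per question}] = \tfrac{1}{5}(+1) + \tfrac{4}{5}\left(-\tfrac{1}{4}\right) = 0,$$

which is the usual rationale offered for negative marking schemes.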

    3.6.4 We should aim to use a variety of different types of MCQ in Geography tests. These

    should include questions involving the recall of knowledge, interpretations of graphics, images,

    mathematical calculations, judgement, comparisons, comprehension, and evaluation. Experience

suggests that students will perform better in tests that involve a mixture of different types of

    questions than in tests that are solely reliant on the recall of knowledge. Typical types of questions

    include, in order of increasing difficulty and ascending order in the Bloom Taxonomy:

    6. Knowledge Recall

    a. Fact

    b. Define a term

    5. Comprehension

a. Explain terms in a passage

b. Explain a technique

    4. Application

    a. Apply a term

    b. Apply a theory

    c. Apply a technique

    d. Calculate

    1. Evaluation

    a. Evaluate a technique

    b. Evaluate a theory

    c. Compare terms

    3.6.5 Other examples of the major generic types of questions for MCQ tests are given in Part

    1.

    3.7 Procedures for Tests

    3.7.1 Before each test the Module Convenor should ensure that he or she has collected from

the School (a) the correct number of MCQ Cards and (b) the School's stock of HB lead pencils (see

    item 3.7.5). If the test is to be taken during a University Exam Period the Convenor should check

with the Exams Office and/or the School Coordinator to see that these requirements have been met.

    3.7.2 To minimize possibilities for cheating, where a test is to take place during a module, the

    Module Convenor should (a) ensure on entering the room that students are well spaced apart, with at

    least one empty seat between them; (b) remind students they are under examination conditions and

    that talking is not allowed until all the cards and question papers have been handed in at the end of

    the test. Other possible strategies, such as having different question papers with different sequences

    of the same questions, would be difficult to mark and could be quite confusing for students.


    3.7.3 At the start of each test the Module Convenor or Examination Invigilator should read to

    the class a standardized set of instructions for completing the test satisfactorily. These are listed

    below as Items 3.7.5 to 3.7.9. These instructions, which repeat those printed at the top of the

    question paper, should avoid many of the problems associated with card marking.

    3.7.4 If the test is to be taken during a University Exam Period then either the Convenor

    should attend to read out these instructions, or he or she should ensure that the Invigilator has been

    properly briefed on their importance and the need to read them out.

    3.7.5 The computer cards must be marked correctly using an HB lead pencil and no other implement or type of pencil. Otherwise the card reader, which is very sensitive to the density of graphite on the card, will reject the card. Students without an HB pencil should be invited to borrow one from the School's stock (see Item 3.7.1).

    3.7.6 Students should first enter their University Student Identification Number by shading the appropriate boxes on the card. They should also write their name and number (a) on the top of the card, in case the computer cannot read it; and (b) on the top of the question paper (to serve as a backup and to emphasize to students that the question paper must not be taken away afterwards).

    3.7.7 Students should complete each box that they select by shading it fully with an HB lead pencil. If they simply put a tick, cross or dot in the box, the card reader will reject the card, and the person responsible for processing the cards then has to shade over the boxes before the card reader will accept it.

    3.7.8 If a student makes a mistake when filling in an answer and attempts to rub out a mark, they should rub out the whole pencil mark and not smudge it over several boxes. The card reader will reject any card that is badly smudged. Students without erasers should put a light diagonal slash through the incorrect selection, so that it can be rubbed out later by the person who processes the cards.

    3.7.9 To reduce the chance of smudging at the correction stage, students should be advised to

    make their first selection by using a light shading, and only make it as dark as possible when they

    are sure that this is the selection they really want to retain.

    3.7.10 The question paper should be collected immediately after an MCQ test, to avoid it going

    into general circulation among students. This is particularly important if an MCQ test is used in a

    University Examination.

    3.8 Processing of MCQ Cards

    3.8.1 The Module Convenor should take full responsibility for understanding and supervising the procedures for processing MCQ cards. The most frequent practical problems encountered involve the computer-marked card reader and the computer software for processing cards. Such problems are often blamed on the card reader, the software or some other feature, but they usually boil down to unfamiliarity with the system.

    3.8.2 The processing of MCQ cards is actually a relatively simple matter for those familiar with the system, but there are a number of pitfalls which can cause problems and which often lead colleagues to regard the system as useless. However, with proper training and support the system can work well in the School, as it does in many other departments of this and other universities.


    3.8.3 Before taking the cards for marking by the card reader, the Module Convenor should

    check every card to see that the student has followed the correct procedure in the test. There are two

    main types of incorrect completion:

    a. A student has used a pencil with the wrong density of lead. To remedy this the Convenor must darken all completed boxes on the card with an HB pencil, so that the card reader will read the card.
    b. Ink, biro, felt tip pen or liquid pencil has been used. Each such card must be copied onto a new card by the Convenor using an HB lead pencil.

    3.8.4 The Module Convenor should produce, using an HB pencil, a Master Card containing

    the correct answers to all the MCQs.

    3.8.5 Two types of card-marking software are now in general use in the University: WinCards (which is installed on the School's system) and JavaCards 2002 (installed on the FLDU system).

    3.8.6 To mark cards using the Wincards software:

    a. The Convenor should obtain from the Student Information System (Banner) a list of students registered for the module. The list is usually provided as a Microsoft Excel file containing the name and student identification number of each student. This file must then be transformed manually and saved in Lotus 123 (.WKS) file format. The software for processing MCQ cards will then read this Class File (it will not accept any other file format) and use it to link the cards to students. The Module Convenor should also enquire about any other special requirements of the present version of the card-marking software (e.g. that any leading zeros in the Student Number should be removed).

    b. The Master Card is fed into the card reader. A visual display of the Master Card is shown on screen and should be checked carefully; as all other cards are compared to the Master Card, it is essential that it is correct.

    c. The card for each student is then fed into the card reader twice. The first reading is checked against the second. If they are the same, then the student's number, name and marks appear in a list on the display. If there is a difference between the first and second readings of a card, an error is signalled showing where it occurs on the card. The error is typically a smudge, a partly filled box or an omitted detail. The person processing the cards then has to make an appropriate correction to the card.

    3.8.7 To mark cards using the Javacards 2002 software (which will work with any version of

    Windows from Windows 95 upwards, including Windows XP):

    a. Obtain class lists by

    i. Logging in to Boddington Common

    ii. Going to the User Directory in the Information Floor of the Boddington Common

    Gatehouse.

    iii. Select Find Users (Advanced).


    iv. In the "Search Within Group" list select leeds_uni.registry.subjects.geog.

    v. In the "Search by Name or Unique Identifier" list select user name, i.e. GEOGxxxx.

    vi. Type * into the Search For box.

    vii. Select the XML file radio button.

    viii. Click on the Search push button.

    ix. Wait for the complete page of results to appear.
    x. Use the Save or Save As command in your Web browser to save the file, with the file name ending in ".xml" and not ".html". Save it in the m:carddata directory. Before saving, be sure to click on the window containing the class list, otherwise the correct file will not be saved.
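    A small, optional check along the following lines can confirm that the browser really saved the XML class list rather than the surrounding web page; the file path shown is hypothetical and should be adjusted to match the saved file.

        import xml.etree.ElementTree as ET

        # Hypothetical path to the saved class list; adjust to the actual file name.
        class_list_path = r"m:\carddata\geog.xxxx.xml"

        try:
            root = ET.parse(class_list_path).getroot()
            # A saved HTML page will usually fail to parse above, or have an <html> root.
            if root.tag.lower().endswith("html"):
                print("This looks like a saved HTML page, not the XML class list.")
            else:
                print(f"Class list parsed; root element is <{root.tag}>.")
        except ET.ParseError as error:
            print(f"File is not well-formed XML ({error}); re-save the class list as .xml.")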

    b. Enter details in the set-up dialogue box

    i. Select a class list file by using the second Browse push button.

    ii. Enter the module code and test identifier (e.g. 2 = Test 2)

    iii. Press the OK button.

    c. Press the New Master button and pass your Master Card through the card reader. Check

    it carefully on the screen.

    d. Pass the cards through the reader. Cards only have to be passed through once. But you

    must compare the card visually with the image displayed on the screen. If there is only one correct

    answer to each question the software automatically assumes that you are not using negative marking.

    e. One of the innovations in the new software is that if the card has been misread, or the

    card has problems owing to faulty completion, you can use the mouse to edit the card on the screen.

    So if the card reader shows as a valid selection a smudged box which the student tried to rub out,

    then you can remove this on the screen and you do not need to either use an eraser on the original

    card or prepare an entirely new copy of the card in order to process it. This saves a lot of time. As a

    check, the full details of the card are stored in a file, so you can see which boxes were manually

    removed. You can also use the "Edit comments for student" command to add notes to each card file

    in order to make a record of the reason for your edits.

    f. The new software does not have an automatic Save function, so you need to save the

    results file regularly, e.g. every ten cards, and when you have read the last card.

    g. When you have finished reading the cards the software will automatically generate a series of files, which you will find in the directory:
    i. geog.xxxx.xml (class list)
    ii. geog.xxxx_2.xml (raw data file)
    iii. geog.xxxx.stats2.bxt (contains the Facility Index for each question, which is the number of correct answers for each question relative to the total number of students in the group)
    iv. geog.xxxx.stats.csv (contains the Frequency Measure, which shows how many students answered each question)
    v. geog.xxxx.sumy.tsv (results file in tabbed text form; can be opened in Excel)


    h. Copy all these files onto your floppy disk so that they can then be copied onto your own computer.

    3.8.8 Marks should be processed as soon as possible after the test and an assessment of the results should be made by the Convenor. This assessment must identify any questions that a majority of students have answered correctly or have been unable to answer correctly. The MCQ test software can assist Convenors in this task, as it generates a Facility Index, a Frequency Measure and a Discrimination Index (see Section 3.10.1) for each question. The test results are available on request from WinCards, but are produced automatically by JavaCards 2002 as separate files.
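    The sort of check described above can be automated in a few lines. The sketch below assumes the Facility Index values have already been read into a simple question-number-to-proportion mapping; the sample values are invented, and the exact layout of the stats files produced by the software may differ.

        # Hypothetical Facility Index values: question number -> proportion of correct answers.
        facility_index = {1: 0.92, 2: 0.35, 3: 0.05, 4: 0.61}

        for question, fi in sorted(facility_index.items()):
            if fi >= 0.90:
                print(f"Question {question}: answered correctly by {fi:.0%} of students - may be too easy")
            elif fi <= 0.10:
                print(f"Question {question}: answered correctly by only {fi:.0%} of students - review the question")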

    3.9 Feedback to Students

    3.9.1 We should aim to give students feedback as a matter of routine in all of our MCQ tests. This adds a formative element to what would otherwise be a purely summative assessment method. Students generally welcome feedback and more readily accept the outcome if questions and answers are explained and seen to be reasonable, fair and part of the module.

    3.9.2 If it is subsequently found that mistakes have been made in setting questions then we

    should admit the error, explain how it came about and inform students that the question will be

    changed in future. We should inform students that the question has been withdrawn from the current

    examination (if this is appropriate) so that no one will be jeopardised. In the same way, questions

    could be withdrawn if they are subsequently found to be answered correctly by only a small

    proportion (e.g. 5%) of the class.
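    If a question is withdrawn after the event, one simple approach (offered only as an assumption, since this Code does not prescribe a method) is to rescale each student's mark to the questions that remain:

        \text{adjusted mark (\%)} = \frac{\text{correct answers among retained questions}}{\text{number of retained questions}} \times 100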

    3.9.3 Posting results on the student notice board is highly recommended (if allowed by the examination procedures). Students like to know where they stand, and MCQ tests give them one measure. Posting results also generates competition and interest and can lead to improvements in performance in subsequent MCQ tests. Remember to use grade labels (e.g. A, B) rather than numerical marks if the results are provisional because they will contribute to a total module mark.

    3.9.4 Module Convenors should aim to give students the results of a test within two weeks of it being held, in accordance with School policy.

    3.10 Evaluation and Moderation

    3.10.1 When the MCQ cards have been processed, the software can generate useful statistics for each MCQ Test. These include:

    a. A Facility Index, which is simply the number of correct answers for each

    question relative to the total number of students in the group. This measures how easy or

    difficult a question was.

    b. A Frequency Measure which shows how many students answered each

    question. A high frequency of wrong answers indicates a problem, as does a high

    frequency of omitted answers.


    c. A Discrimination Index, which indicates how well each question distinguishes between students who perform well on the test as a whole and those who do not, and hence how many students a question confuses.
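    For convenience, the Facility Index for question q in a class of N students, and the conventional form of a Discrimination Index (what the card-marking software itself computes may differ), can be written as

        \text{FI}_q = \frac{n_q^{\text{correct}}}{N}, \qquad D_q = \frac{U_q - L_q}{n_g}

    where U_q and L_q are the numbers of correct answers to question q in an upper and a lower group of n_g students each, ranked by total test mark; D_q then runs from -1 to +1, and values near zero suggest the question does not separate stronger from weaker students.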

    3.10.2 These statistics are extremely useful for evaluating an MCQ test. They should be used routinely by Convenors and Moderators to identify any problems with a test. (Note that the new JavaCards 2002 software does not generate as much information as the older WinCards; see Section 3.8.7.)

    3.10.3 Using the marks stored in an Excel spreadsheet, the Module Convenor can generate further statistics, including the mean, standard deviation and upper and lower quartile marks for the group.
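    These summary statistics can also be produced directly from the tab-separated results file. The sketch below assumes the total mark is the last column of each student row in geog.xxxx.sumy.tsv; the real column layout may differ, in which case adjust the column index.

        import csv
        import statistics

        marks = []
        with open("geog.xxxx.sumy.tsv", newline="") as results_file:
            for row in csv.reader(results_file, delimiter="\t"):
                try:
                    marks.append(float(row[-1]))  # assumed: total mark in last column
                except (ValueError, IndexError):
                    continue  # skip headers, blank lines and non-numeric rows

        quartiles = statistics.quantiles(marks, n=4)  # lower quartile, median, upper quartile
        print(f"Mean: {statistics.mean(marks):.1f}")
        print(f"Standard deviation: {statistics.stdev(marks):.1f}")
        print(f"Lower quartile: {quartiles[0]:.1f}  Upper quartile: {quartiles[2]:.1f}")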

    3.10.4 The SLTC should take a policy decision on the desirability of the Convenor and/or the

    Moderator adjusting the overall marks of a particular MCQ Test or Exam. The general policy at

    present is not to adjust marks unless the Moderator and/or Convenor feel there are exceptional

    circumstances.

    3.10.5 Since the computer-marked cards provide a permanent record of a test, they should be

    securely stored in a box together with the Master Card. This enables an audit to be undertaken if

    desired. Comparing results from sequential years can also provide valuable information about

    relative performance, even when the questions differ. We recommend that sets of cards for Level 1

    Exams and Tests be retained for 1 year and those for Level 2 and Level 3 Exams and Tests for 2

    years.

    3.11 Software for Assessing MCQs

    3.11.1 The WinCards software for processing MCQ cards was developed in the University by Andrew Booth, and the new JavaCards 2002 software by John Hodrien and Andrew Booth. Both run on an IBM PC compatible computer under Microsoft Windows. The software is relatively easy to learn and use effectively, and WinCards is installed on a computer connected to a card reader in the School's Taught Courses Office.

    3.11.2 MCQ tests may also be taken in computer laboratories using online MCQs. Question papers should be prepared in the usual manner and then taken to Dr Andrew Booth, who is currently Director of the Flexible Learning Development Unit in the University. He will make the MCQs available over the university network using custom software he has developed. Students select answers to the MCQs online; answers are processed automatically and the results are returned to Module Convenors by email.

    3.12 Provision of Sample MCQs for Each Module

    3.12.1 The design and writing of good multiple-choice questions to cover the material in a module takes time, practice and patience, but the reduction in time spent on exam marking can be significant, especially for large classes. Investing time in researching different types of multiple-choice questions and answers can therefore yield real benefits in the speed of marking.


    3.12.2 Convenors should aim to create a bank of MCQs to cover the material in their module.

    The MCQs should generally be of different types and meet different cognitive objectives of the

    module.
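    Purely as an illustration of how such a bank might be organised (this is not a format used by WinCards or JavaCards 2002, and all identifiers are invented), a module bank could tag each question with the cognitive level it targets and draw a mixed paper from it:

        import random

        # Hypothetical question bank, tagging each question with the cognitive level it targets.
        question_bank = [
            {"id": "Q001", "level": "knowledge",     "stem": "Define ..."},
            {"id": "Q002", "level": "comprehension", "stem": "Explain ..."},
            {"id": "Q003", "level": "application",   "stem": "Calculate ..."},
            {"id": "Q004", "level": "evaluation",    "stem": "Evaluate ..."},
        ]

        def draw_test(bank, quota):
            """Draw a paper containing the requested number of questions at each level."""
            paper = []
            for level, wanted in quota.items():
                pool = [q for q in bank if q["level"] == level]
                paper.extend(random.sample(pool, min(wanted, len(pool))))
            random.shuffle(paper)
            return paper

        # e.g. a Level 1 paper weighted towards recall, with some higher-level questions.
        test_paper = draw_test(question_bank, {"knowledge": 2, "comprehension": 1, "application": 1})
        print([q["id"] for q in test_paper])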

    3.12.3 Students should be given a sample of typical MCQs for each module. The sample should provide the answer to each question, explaining why some options are correct and others are wrong. Lecturers should generally go over the sample questions during a class so that students become familiar with what will be expected of them during an MCQ test.

    3.13 Tapping Into Existing Banks of MCQs for Ideas

    3.13.1 Lecturers can also explore existing banks of MCQs for ideas on how to structure their own questions. There are many books containing sets of sample questions from other disciplines, and banks of questions are available in EndNote, Access and other software formats. Their chief benefit is to give ideas on how to construct different types of MCQs, rather than to be used verbatim. The Internet also contains a wealth of sources of MCQ tests in various disciplines, including some aspects of geography.

    3.13.2 Some MCQ tests prepared and tested in other disciplines and available in texts might

    also be of interest to geographers.

    3.14 Sharing Best Practice in the Use of MCQs in Geography

    3.14.1 To improve practice in the School, outside speakers should be invited to give a workshop on the use of MCQ tests in Geography, and should be encouraged to cover all aspects of using MCQ tests.

    3.14.2 Diverse procedures and computer systems for MCQ tests are now used in universities in

    the UK and USA. These should be monitored and evaluated to determine if we might benefit from

    them.

    3.15 Recommendations to the School Learning and Teaching Committee

    3.15.1 MCQ assessment should be seen as part of a comprehensive system of selecting an integrated set of appropriate assessment tools for each module at a given level. No particular tool should be seen as inferior or superior to any other; each should merely be evaluated as to whether it is appropriate and how it can complement other tools. Recommendations on how to do this are made at the end of Part 1 of this report.

    3.15.2 To provide an outline framework for defining the role of MCQ tests within such a

    comprehensive system of assessment, the SLTC should define pedagogical aims for each level of

    study.

    3.15.3 As we should expect a gradual progression in knowledge and skills of students from

    Levels 1 to 3, the assessment tools used for each level, and the type and complexity of questions set

    in each assessment tool, should reflect this progression.


    3.15.4 Within this framework, we consider it to be entirely justified to use MCQ tests for

    summative assessment at Level 1, and for formative assessment at Levels 2 and 3.

    3.15.5 Marks derived from MCQ assessment tools can account for over half of the total mark

    of Level 1 modules, for which desired learning outcomes are on the lower stages of the Bloom

    Taxonomy, and such tools can be used for both formative and summative assessment. At Levels 2

    and 3 MCQ tools should be restricted to formative assessment and provide less than 50% of the total

    module mark.

    3.15.6 MCQ tests at all three levels should contain a mixture of different question styles that

    test learning outcomes at various stages of the Bloom Taxonomy. In proceeding from Level 1 to

    Level 3 there should be an increase in the proportion of the questions that test learning outcomes at

    middle or high stages of the taxonomy.

    3.15.7 Students should be fully informed beforehand of the content of MCQ tests and their role

    in completing them, and be given prompt feedback on results.

    3.15.8 Colleagues employing MCQ assessment should be asked to follow this Code of

    Practice, particularly as it relates to the design, implementation and processing of tests. This should

    avoid, if not eliminate, many of the problems that have occurred until now.

    3.15.9 The SLTC should take a policy decision on whether the Convenor

    can, on the advice or with the agreement of the Moderator, adjust the overall marks of a particular

    MCQ Test or Exam. The general policy at present is not to adjust marks unless the Moderator and/or

    Convenor feel there are exceptional circumstances. We recommend that this policy be maintained,

    as the need for moderation is predominantly a result of the subjectivity associated with the marking

    of essay exams.

    3.15.10 The SLTC should discuss whether it should take a policy decision on the desirability of

    the use of negative marking, or leave it to the individual module convenor to decide.

    3.15.11 The SLTC should appoint an MCQ Coordinator with responsibility to: (a) Keep this Code of Practice under review; (b) Provide advice to members of staff on MCQ design and implementation and on card reader operation procedures; (c) Report to the Head of Department when hardware or software problems render the MCQ marking system non-viable and remedial action is required; (d) Liaise with the Director of Learning and Teaching, the Examinations Officer and the Departmental Coordinator; and (e) Advise the School on improvements in best practice within the University and outside.

    3.15.12 We commend this Code of Practice to the SLTC, and would welcome any comments

    and suggestions for amending it, from the committee, external examiners, learning and teaching

    experts within the university, and experts in other universities, before it is formally adopted. The aim

    should be to ensure that the use of MCQ tools in the School conforms with norms within this

    University and in other universities, in Geography and other disciplines.
