Cognitive Load Factors.pdf
Transcript of Cognitive Load Factors.pdf
8/18/2019 Cognitive Load Factors.pdf
COGNITIVE LOAD FACTORS IN
INSTRUCTIONAL DESIGN FOR
ADVANCED LEARNERS
No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or
by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no
expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of information
contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in
rendering legal, medical or any other professional services.
COGNITIVE LOAD FACTORS IN INSTRUCTIONAL DESIGN FOR
ADVANCED LEARNERS
SLAVA KALYUGA
Nova Science Publishers, Inc. New York
Copyright © 2009 by Nova Science Publishers, Inc.
All rights reserved. No part of this book may be reproduced, stored in a retrieval system
or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape,
mechanical photocopying, recording or otherwise without the written permission of the
Publisher.
For permission to use material from this book please contact us:
Telephone 631-231-7269; Fax 631-231-8175
Web Site: http://www.novapublishers.com
NOTICE TO THE READER
The Publisher has taken reasonable care in the preparation of this book, but makes no
expressed or implied warranty of any kind and assumes no responsibility for any errors or
omissions. No liability is assumed for incidental or consequential damages in connection
with or arising out of information contained in this book. The Publisher shall not be liable
for any special, consequential, or exemplary damages resulting, in whole or in part, from
the readers’ use of, or reliance upon, this material.
Independent verification should be sought for any data, advice or recommendations
contained in this book. In addition, no responsibility is assumed by the publisher for any
injury and/or damage to persons or property arising from any methods, products,
instructions, ideas or otherwise contained in this publication.
This publication is designed to provide accurate and authoritative information with regard
to the subject matter covered herein. It is sold with the clear understanding that the
Publisher is not engaged in rendering legal or any other professional services. If legal or
any other expert assistance is required, the services of a competent person should be
sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A
COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF
PUBLISHERS.
LIBRARY OF CONGRESS CATALOGING-IN-PUBLICATION DATA
ISBN: 978-1-60741-685-2 (E-Book)
Available upon request
Published by Nova Science Publishers, Inc. New York
CONTENTS
Preface
Chapter 1 Basic Architecture of Human Cognition
Chapter 2 Cognitive Studies of Expert-Novice Differences and Design of Instruction
Chapter 3 Cognitive Load Perspective in Instructional Design
Chapter 4 Cognitive Load Principles in Instructional Design for Advanced Learners
Summary Toward a Cognitively Efficient Instructional Technology for Advanced Learners
Index
PREFACE
The empirical evidence described in this book indicates that instructional
designs and procedures that are cognitively optimal for less knowledgeable
learners may not be optimal for more advanced learners. Instructional designers or
instructors need to evaluate accurately the learner levels of expertise to design or
select optimal instructional procedures and formats. Frequently, learners need to
be assessed in real time during an instructional session in order to adjust the
design of further instruction appropriately. Traditional testing procedures may not
be suitable for this purpose. The following chapters describe a cognitive load
approach to the development of rapid schema-based tests of learner expertise. The
proposed methods of cognitive diagnosis will be based on contemporary
knowledge of human cognitive architecture and will be further used as means of
optimizing cognitive load in learner-tailored computer-based learning
environments.
Chapter 1
BASIC ARCHITECTURE OF HUMAN
COGNITION
A cognitive approach to human learning emphasizes the internal cognitive
mechanisms of learning. Such mechanisms are usually described as
transformations performed on various mental representations of situations and
tasks. An important assumption of the approach is that a single general cognitive
system underlies human cognition. Different theoretical approaches specify this
general cognitive system as corresponding cognitive architectures. The
understanding of human cognition within a cognitive architecture requires
knowledge of corresponding models of memory organization, forms of knowledge
representation, mechanisms of problem solving, and the nature of human
expertise.
MEMORY ORGANIZATION
The major characteristics of human memory are its strength or durability,
capacity (number of items of information stored in memory), and speed of access.
According to these characteristics, memory is divided into long-term memory and
short-term memory. Long-term memory (LTM) is characterized by high strength
and includes well-learned knowledge, for example, the name of the first US
President, 5 x 5 = 25, or the spelling of the word potatoes. It is presumed to have
unlimited capacity, although the access to the stored information could be slow.
Both the strength of memory and the speed of access increase with practice. More
fully elaborated and more deeply processed material results in better long-term
memory.
Short-term memory (STM), on the other hand, includes information that has
just been encoded from sensory registers or retrieved from long-term memory,
for example, what you were thinking about a moment ago, or what you hold in
mind while dialing the phone number 8344 2124. The durability of
STM is a matter of seconds (Peterson & Peterson, 1959), and information in STM
could be accessed very rapidly. The number of items of information that can be
maintained in an active state simultaneously in STM is about seven units for most
people (Miller, 1956). For example, it is very difficult for us to recall more than
approximately seven serially presented random numbers (e.g., an unfamiliar
phone number) a few seconds after we hear or see them, unless the numbers have
been intentionally rehearsed. When asked to copy strings of digits from one page
to another, we usually do this by grouping the digits by easily manageable units of
three or four at a time.
The most generally specified basic human cognitive architecture includes
these two substructures (STM and LTM). Examples are the standard model
(Newell & Simon, 1972) and modal model (Atkinson & Shiffrin, 1968; Waugh &
Norman, 1965). In more specific models, these substructures might be regarded
either as a single memory store with different modes of activation for long-term
and short-term components, or as separate memory stores. These distinctions are
not essential when considering the basic level of cognitive architecture. However,
in order to explain human cognition, this general model needs to be supplemented
by some attention control mechanism (central processor or central executive)
which determines what information from sensory stores or LTM is brought into
STM. The information that is actually attended to is limited to a small number of
chunks in STM (Simon, 1979; Ericsson & Simon, 1993a, 1993b).
Various cognitive architectures and elaborations of the general model extend
the described memory structure. For example, the concept of working memory
(WM) was introduced to account for processing of units of information that are
interconnected, rather than random, and should be processed concurrently because
of the nature of things they reflect or due to established associations in long-term
memory. Working memory is considered as "a system for the temporary holding
and manipulation of information during the performance of a range of cognitive
tasks" (Baddeley, 1986, p. 34), a “desktop of the brain … that keeps track of what
we are doing or where we are moment to moment, that holds information long
enough to make a decision, to dial a telephone number, or to repeat a strange
foreign word that we have just heard” (Logie, 1999, p.174). Some simple
examples of working memory operation could be provided by the following tasks:
close your eyes and pick up a pen in front of you; count the number of windows in
your house or apartment; mentally rearrange the furniture in your room, or
mentally complete a mathematical operation (for more examples, see Logie,
1999).
After incoming stimuli from an external source are registered in sensory
memory, perceived or matched to recognizable patterns by using prior knowledge
(if any) in LTM and context, and are paid attention to, they are transferred into
WM. If a unit of information is not recognized due to the lack of appropriate LTM
patterns, it still could be attended to and processed in WM, with appropriate
cognitive resources allocated for the task. Attended units of information in WM
are assigned meaning and used for constructing integrated mental representations
of a situation or task (Figure 1). This information, however, may fade very
quickly if attention is diverted or if the capacity of WM is overloaded.
Baddeley and Hitch (1974) first proposed that WM performed both
processing and storage functions. They suggested three structural components of
working memory: a central executive and two separate auditory and visual stores
for handling verbal information and visual images. These two stores serve as
maintenance systems controlled by the central executive and are called
respectively an articulatory or phonological loop (‘inner voice’) and a visuospatial
sketchpad (‘inner eye’). The limited capacity of the central executive is used for
processing incoming information, with the remainder used for the storage of
intermediate and final products of that processing. Storage and processing
capabilities of WM trade off against each other. When memory load increases
above some threshold, our performance could be inhibited. To get a feeling of
WM limitations, try to mentally add two large numbers (for example, 83 468 437
and 93 849 040). For a concurrent task, you may try also to attend simultaneously
to a comedy show on your TV. It would be very difficult to do because each of
these activities alone may take all of your WM resources.
There are three major functional aspects of working memory operation:
temporary storage, manipulation of information, and executive control.
Temporary storage of information was the focus of classic models of STM and
was studied using standard word or digit STM span tasks. These were simple
tasks involving recalling a list of digits or unrelated words and not requiring much
prior knowledge. Active manipulation of information has been the focus of
models of WM and has been studied using WM span tests that require concurrent
processing of several tasks. These are relatively more complex tasks involving
meaningful cognitive operations such as reading sentences or performing
numerical transformations, and then recalling the final words of those sentences or
results of the math operations. Performance of complex cognitive tasks requires
simultaneous use and integration of various sources of information, coordination
of separate processes and representations. It is the executive functioning of WM and the
interactions between WM and LTM knowledge structures that have become the
focus of research in recent years (see Miyake & Shah, 1999, for a recent overview
of WM models and the state of the field).
A number of hypotheses have been proposed to explain individual differences
in WM capacity and its relation to performance. These theories considered
differences in total WM capacity, differences in processing efficiency of WM, or
both. According to the total capacity approach (Baddeley & Hitch, 1974; Cantor
& Engle, 1993; Case, 1985; Engle, Cantor, & Carullo, 1992), all cognitive
processes require resources from a fixed pool. Any resources not allocated to the
operations can be used for short-term storage. The storage and processing
capabilities of working memory trade off against each other. When memory load
increases above some threshold, a person’s performance may decline. A change in
total capacity caused, for example, by fatigue or age should affect the
performance in a wide range of tasks.
[Figure 1. Basic architecture of human cognition. Diagram labels: Sensory Memory (incoming information); Working Memory (constructing mental representations of a situation or task); Long-Term Memory (knowledge base).]
The task-specific hypothesis (Daneman & Carpenter, 1980) assumed that
WM capacity is specific to the particular task being performed. Efficient processing skills leave more WM capacity for storage of processing products. A
change in processing efficiency should be specific to a particular task and result
from intensive practice or training (Just & Carpenter, 1992). Performance would
be influenced only if available resources are in short supply when a person
operates at the limit of WM capacity. The processing efficiency approach assumes
that a single central system is responsible for the processing and temporary
storage of information. Its limited capacity must be shared between the processing
and the storage demands. Individuals with inefficient processes have a
functionally smaller storage capacity because they must allocate more resources to
the processes (Daneman & Carpenter, 1983; Daneman & Tardif, 1987).
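The trade-off described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that processing and storage draw on one shared pool and that demands simply subtract from it; the function name, the units, and the numbers are hypothetical and are not taken from the cited studies.

```python
def storage_capacity(total_capacity: float, processing_cost: float) -> float:
    """Capacity left for storage after processing draws on a single shared pool.

    Assumes (for illustration only) that demands simply subtract from the pool.
    """
    if total_capacity < 0 or processing_cost < 0:
        raise ValueError("capacities must be non-negative")
    return max(0.0, total_capacity - processing_cost)


# Two hypothetical individuals with the same total capacity (arbitrary units).
TOTAL = 7.0
efficient = storage_capacity(TOTAL, processing_cost=3.0)
inefficient = storage_capacity(TOTAL, processing_cost=5.0)
print(efficient, inefficient)
```

With equal total capacity, the individual whose processing is less efficient is left with a functionally smaller store, which is the point the processing efficiency approach makes.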
Working memory capacity was measured in terms of operational capacity
dependent on the type of specific background task used in a particular domain
(Carpenter & Just, 1989). For example, the reading span test was used to measure
WM capacity as the largest size of the set of simple sentences from which a
subject can reliably recall the final words of all the sentences (Daneman &
Carpenter, 1983). Daneman and Tardif (1987) established that the reading span
was a measure specific to the language skills, not a measure of general working
memory capacity, and it correlated significantly with reading comprehension
ability.
Although there obviously are systematic differences among individuals in
their working memory capacity for specific tasks, and these differences influence
performance when the person operates at the limit of his or her working memory
capacity, no single approach or hypothesis concerning the interpretation of
individual differences in WM capacity has received convincing empirical support.
Such differences could be strongly influenced by knowledge structures available
in long-term memory. Any WM span implicitly reflects an individual's knowledge
and experience in a domain, and this knowledge inevitably influences his or her
performance in both processing and storage parts of the task (e.g., Hulme,
Maughan, & Brown, 1991; Hulme, Roodenrys, Brown, & Mercer, 1995). WM
span measures thus could be used as predictors of the person’s performance in the
corresponding domain rather than measures of his or her true general WM
capacity. It is practically impossible to eliminate the influence of the person’s
knowledge base when meaningful tasks are involved in WM span tests. From this
point of view, approaches that focus on connections between the content and
operation of working memory and long-term memory could be more relevant and productive.
Simple chunking mechanisms provide an example of using long-term
memory structures in transforming the content of working memory. The chunk is
a familiar unit of information based upon previous learning. For example, it could
be difficult to remember and recall a string of random letters like
B,B,C,C,I,A,A,B,C,F,B,I, unless we chunk them together into BBC, CIA, ABC,
FBI. In this case, we use our prior knowledge stored in LTM to reduce the number
of elements to a manageable four chunks. The same method could be used with
the following string of numbers: 1,9,1,4,1,9,4,5,1,9,9,6,2,0,0,1. Another common
example of chunking in language comprehension is the way we chunk letters into
familiar words, and words into familiar phrases. An STM capacity estimate of
around seven units (Miller, 1956) actually indicates the number of chunks rather
than total amount of information stored in STM. This mechanism explains how
we manage to get around the information-processing bottleneck created by our
limited working memory capacity, and to learn the enormous amount of
knowledge in our LTM.
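The chunking examples above can be sketched in a few lines of code. This is an illustrative sketch only: the set KNOWN_CHUNKS is a hypothetical stand-in for familiar units in LTM, the greedy longest-match strategy is an assumption rather than a claim about how people chunk, and reading the digit string as four familiar years is our interpretation of the example.

```python
# Hypothetical stand-in for familiar units stored in long-term memory.
KNOWN_CHUNKS = {"BBC", "CIA", "ABC", "FBI", "1914", "1945", "1996", "2001"}


def chunk(items, known=KNOWN_CHUNKS, max_len=4):
    """Greedily group a sequence of symbols into the longest familiar chunks.

    Symbols that match no familiar unit stay as single-item chunks.
    """
    chunks = []
    i = 0
    while i < len(items):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(max_len, len(items) - i), 0, -1):
            candidate = "".join(items[i:i + size])
            if size == 1 or candidate in known:
                chunks.append(candidate)
                i += size
                break
    return chunks


letters = "B,B,C,C,I,A,A,B,C,F,B,I".split(",")
digits = "1,9,1,4,1,9,4,5,1,9,9,6,2,0,0,1".split(",")
print(chunk(letters))  # 12 letters reduce to 4 chunks
print(chunk(digits))   # 16 digits reduce to 4 chunks
```

Without matching entries in KNOWN_CHUNKS, every symbol remains its own chunk, which mirrors the difficulty of recalling the strings when no prior knowledge applies.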
People can be trained to effectively increase their memory capacity to an
amazing degree through extensive training in chunking and re-chunking
information into meaningful units using their prior knowledge stored in LTM. The
skilled memory theory (Chase & Ericsson, 1982) claims that people develop
mechanisms that enable them to use a large and familiar knowledge base to
rapidly encode, store, and retrieve information within the area of their expertise
and thus circumvent the working memory capacity limitations. As a result, experts
possess an enhanced functional working memory capacity in domains of their
expertise (Ericsson & Staszewski, 1989).
Available domain-specific knowledge enables experts to quickly encode and
retain large amounts of information in LTM. Such LTM storage and retrieval
operations speed up with practice and are comparable with STM encoding and
retrieval, resulting in experts' superior task performance and superior recall for
familiar materials (the skilled memory effect; Ericsson & Staszewski, 1989). For
example, expert mnemonists can increase their digit spans far beyond the limit of
Miller's seven plus-or-minus two digits. They use familiar chunks of knowledge
in LTM to encode new information in an easily accessible form. Ericsson and
Staszewski (1989) described a person who expanded his digit span to 84 digits by
grouping them into short sequences and encoding them in terms of athletic
running times, dates, and ages that were familiar to him. He nevertheless operated
under the constraints of limited-capacity STM: the size of digit groups never
exceeded five digits, and these groups were never clustered into supergroups of
more than four groups.
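The two constraints just described (groups of at most five digits, supergroups of at most four groups) can be made concrete with a small sketch. The uniform group size of four and the function names are assumptions chosen for illustration; the retrieval structure reported in the study was built from meaningful encodings, not from fixed-size slices.

```python
def group(items, size):
    """Split a sequence into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def retrieval_structure(digits, group_size=4, groups_per_supergroup=4):
    """Arrange digits into groups, then arrange groups into supergroups.

    Enforces the limits reported in the text: no group longer than five
    digits, no supergroup with more than four groups.
    """
    if not 1 <= group_size <= 5:
        raise ValueError("group_size must be between 1 and 5")
    if not 1 <= groups_per_supergroup <= 4:
        raise ValueError("groups_per_supergroup must be between 1 and 4")
    return group(group(digits, group_size), groups_per_supergroup)


# An arbitrary 84-digit string (hypothetical data).
digits = "".join(str(i % 10) for i in range(84))
structure = retrieval_structure(digits)
print(len(structure), "supergroups;", sum(len(s) for s in structure), "groups")
```

Each level of the hierarchy stays within a small number of units even though the whole structure holds 84 digits.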
In the WM model of Carpenter and Just (1989), the operation of WM during
reading comprehension is also based on relations between WM and LTM. In this
model, WM consists of currently active pointers to LTM structures and partial or
final products of processing. A reader stores the theme of the text, the general
representation of the situation, the major propositions from preceding sentences,
as well as a representation of the sentence he or she is currently reading (Just &
Carpenter, 1992). When dealing with an unstructured series of words, we can
usually recall only six or seven unrelated words in order (according to our STM
span). Skillful readers, on the other hand, can recall and understand long
sentences (about 77% of words in up to 22-word sentences) because they use
internal structures in LTM to circumvent WM limitations. Thus, sentence
comprehension can be considered as recoding (chunking) incoming symbols into
some structure (Carpenter & Just, 1989).
Ericsson and Kintsch (1995) further developed these ideas into the theory of
long-term working memory (LT-WM). In this theory, LTM knowledge structures
associated with components of working memory form a LT-WM structure that is
capable of holding a virtually unlimited amount of information. Some additional
mechanisms were introduced for overcoming the effects of interference in experts'
use of LTM knowledge for storage and retrieval of newly encoded information.
The proposed mechanism of LT-WM operation involves cue-based retrieval of
information from LTM. The encoding method can be based on a
specifically constructed retrieval structure, an elaborated existing memory
structure, or a combination of the two. Skilled performance depends on domain-
specific knowledge structures relevant to particular tasks, and, consequently, there
are individual differences in the operation of LT-WM for a given task (Ericsson &
Kintsch, 1995).
KNOWLEDGE REPRESENTATIONS
Our knowledge base in LTM profoundly influences cognitive processes in
most situations. Therefore, forms of knowledge representations are critical for
understanding human cognition. Several major ways of representing the meaning
of information in memory have been suggested: propositional representations
(semantic networks), procedural representations (production systems), and
schemas. Analogical representations or mental models (Rumelhart & Norman,
1983) can be generally considered as schemas. The concept of a proposition
denotes the primitive unit of meaning, or the smallest unit of knowledge about
which it is possible to make the judgment, true or false. Networks of such
interconnected units can be used to represent the meaning of sentences and
pictures.
Newell and Simon (1972) suggested that knowledge could be represented by
a set of conditional rules, or productions, of the form condition → action. The production rules
are stored in long-term memory and are retrieved and used in working memory.
The current contents of working memory are matched against the conditions of all
the production rules in long-term memory. Whenever the conditions of a rule
occur in working memory, the rule is triggered and its action is carried out. Action
of the rule can change the contents of working memory and determine which rule
is triggered next. Thus, the principles determining how one rule is followed by
another are built into the rules themselves.
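The match-and-fire cycle described above can be sketched as follows. This is a toy illustration, not Newell and Simon's implementation: the Rule class, the telephone rules, and the choice to fire the first matching rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Set


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: FrozenSet[str]          # facts that must all be in working memory
    action: Callable[[Set[str]], None]  # changes working memory when the rule fires


def lift_handset(wm: Set[str]) -> None:
    wm.discard("handset down")
    wm.add("handset up")


def dial(wm: Set[str]) -> None:
    wm.discard("goal: call")
    wm.add("number dialed")


# Hypothetical production rules held in long-term memory.
RULES = [
    Rule("lift-handset", frozenset({"goal: call", "handset down"}), lift_handset),
    Rule("dial", frozenset({"goal: call", "handset up"}), dial),
]


def run(rules: List[Rule], working_memory: Set[str], max_cycles: int = 10) -> List[str]:
    """Repeatedly match rule conditions against working memory and fire one rule."""
    fired = []
    for _ in range(max_cycles):
        matching = [r for r in rules if r.conditions <= working_memory]
        if not matching:
            break
        rule = matching[0]  # simplistic choice among matches (an assumption)
        rule.action(working_memory)
        fired.append(rule.name)
    return fired


wm = {"goal: call", "handset down"}
trace = run(RULES, wm)
print(trace, wm)
```

No separate control program orders the steps: the first rule's action changes working memory so that only the second rule matches, which is how the sequencing is built into the rules themselves.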
One of the most advanced theories based on the idea of production rules, the
ACT* theory (Adaptive Control of Thought; Anderson, 1983), or its updated
version ACT-R (R for rational; Anderson, 1993), suggests a separate type of long-term
memory for production rules (for skills) in addition to the declarative
memory (propositions, images, and other representations for facts and
experiences). The items in these memories can vary in their degree of ‘activity’. If
the contents of working memory match more than one rule in procedural memory,
then whichever is the most active is triggered.
The concept of a schema, originally discussed by Bartlett (1932), came into
cognitive psychology from research in artificial intelligence (Minsky, 1975;
Bobrow & Winograd, 1977). Schemas generally represent an object as a set of
attributes (slots). Schemas abstract generalizations about objects from specific
instances, encode general categories and typical features. They may include not
only propositions, but also perceptual features (for example, spatial images) and
stereotypic sequences of events. Schemas may have slots with fixed or variable
values; slots with variable values usually have some default or most probable
values.
The most important features of schemas are stable patterns of relationships
between variables (slots). Each schema contains information about some class of
structures. When particular values are assigned to slots of a schema, a schema-
based knowledge structure could be obtained in the form of concepts,
propositions, etc. The obtained knowledge structures could be more general or
more specific depending on those values. Multiple schemas can be linked together
and organized into sophisticated hierarchical structures where one schema can
form part of a more complex schema.
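A slot-based schema of this kind can be sketched as a small data structure. The Schema class, its field names, and the wheel and bicycle examples are hypothetical illustrations of fixed slots, variable slots with default values, and one schema forming part of another; they are not drawn from the cited work.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Schema:
    name: str
    fixed: Dict[str, Any] = field(default_factory=dict)     # slots with fixed values
    defaults: Dict[str, Any] = field(default_factory=dict)  # variable slots with default values

    def instantiate(self, **values: Any) -> Dict[str, Any]:
        """Fill variable slots with the given values, falling back to defaults."""
        unknown = set(values) - set(self.defaults)
        if unknown:
            raise KeyError(f"{self.name} has no variable slots {sorted(unknown)}")
        return {**self.fixed, **self.defaults, **values}


wheel = Schema("wheel", fixed={"shape": "round"}, defaults={"diameter_cm": 60})

# A more complex schema whose "wheel" slot is itself filled by another schema.
bicycle = Schema(
    "bicycle",
    fixed={"wheel_count": 2},
    defaults={"wheel": wheel.instantiate(), "color": "black"},
)

red_bike = bicycle.instantiate(color="red")
print(red_bike)
```

Assigning a value to the color slot yields a more specific knowledge structure, while the unassigned wheel slot keeps its default instantiation.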
Schemas may represent knowledge of all kinds and levels: from individual
letters (allowing us to recognize different variations of handwritten letters) to
complex electronic or organizational systems, behavioral patterns, visual and
auditory perceptual images. For example, our schema for a human face includes
slots for eyes, a nose, a mouth, ears, etc. These components are arranged in a
certain configuration that is not a rigid one. However, some general requirements
should be met: the nose and eyes should be located above the mouth; eyes should
be located above the nose on different sides of it, etc. This general schema allows
us to recognize instances of human faces in limitless situations, including some
peculiar forms of visual arts.
A student’s schema for solving linear algebraic equations of the type ax = b
may include three slots: 1) a number b on the right hand side of the equation; 2) a
number a on the left hand side of the equation; and 3) the division operation:
divide the content of the first slot by the content of the second slot. For less
experienced students, the schema may include the operation of dividing both sides
of the equation by the same number a. In this case, the schema would contain
slots for both parts of the equation, the dividing number a, and the division
operation.
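The two versions of the ax = b schema can be written out as a short sketch. The function names are hypothetical, and exact fractions are used only so that the results are easy to check; the first function follows the three-slot version, the second follows the version that divides both sides by a.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Three-slot schema for a*x = b: slot 1 holds b, slot 2 holds a,
    and the operation divides slot 1 by slot 2."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return Fraction(b) / Fraction(a)


def solve_linear_both_sides(a, b):
    """Less experienced version: divide both sides of a*x = b by a."""
    a, b = Fraction(a), Fraction(b)
    if a == 0:
        raise ValueError("a must be non-zero")
    left_coefficient = a / a  # the left side a*x becomes 1*x
    right_side = b / a        # the right side b becomes b/a
    assert left_coefficient == 1
    return right_side


print(solve_linear(4, 10))             # x for 4x = 10
print(solve_linear_both_sides(4, 10))  # same result, with the extra step explicit
```

Both functions reach the same value of x; the second simply keeps the intermediate step that a more practiced solver no longer needs to represent.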
For an example of higher-level schematic knowledge representations,
consider the technical domain that includes knowledge about various technical
objects (e.g., tools, devices, machines, technological procedures). This variety of
knowledge in any technical area could be represented with different levels of
specification: from descriptions of general features to specific details. A
schematic framework for representing knowledge about a technical object may
include three main interconnected components that could be referred to as
functional, operational, and structural descriptions. Any technical object could be
characterized by some functions or purpose it was designed for (what is this
object for?), processes utilized in the object’s operation (how does it operate?),
and the object’s internal structure including links between its components (what
does it consist of?). To explain an object’s operation means to explain why a
given set of linked parts performs specific functions utilizing certain processes
during operation. A learner should establish connections between functional,
operational, and structural components of the object’s description in order to
understand how it works (Kalyuga, 1984; 1990).
Gruber and Russell (1996) suggested similar classes of an artifact description:
structure (the physical and/or logical composition of an artifact in terms of the
composition of parts and connection topologies), behavior (something an artifact
might do in terms of observable states or changes), function (effect or goal to
achieve by artifact behavior), requirements (prescriptions concerning the
structure, behavior, and/or function that the artifact must satisfy), and objectives
(specifications of desired properties of the artifact other than pure functions, such
as cost and reliability). Requirements and objectives could be generally included
into the functional description (as functional requirements and general functions).
[Figure 2. General schematic structure of technical knowledge. Diagram labels: functions of the object; alternative combinations of processes realizing a set of functions; alternative technical solutions realizing a combination of processes.]
Each of the above aspects of technical knowledge may have different levels of
generalization. It is possible to describe an object in very general terms (a global
level or general overview) or in more detail with different levels of specification.
When combined together, all aspects, components, and levels of the description of
a technical object create a sophisticated multilevel hierarchical schematic
structure of technical knowledge. In an abstract form, this structure could be
represented by the graph in Figure 2. Three levels of description are shown for
functions, processes, and structural components of a technical object. Simple and
superficial knowledge about the object may include only isolated components
corresponding to the upper rows in the depicted clusters of knowledge elements.
Further deepening of knowledge requires establishing relations between these
components and adding elaborated knowledge on more specific levels of
description.
There are many definitions of schemas depending on the theoretical
perspective of the researcher. It is practically impossible to precisely describe the
schematic knowledge structures held by an individual. As Norman (1983) noted,
"we must … discard our hopes of finding neat, elegant mental models, but instead
learn to understand the messy, sloppy, incomplete, and indistinct structures that
people actually have" (p. 14). In general, a schema can be described functionally
as a cognitive construct (an organized knowledge structure) that allows people to
classify information according to the manner in which it will be used (e.g., Chi,
Glaser, & Rees, 1982; Sweller, 1993). Such organized knowledge structures
represent a major mechanism for extracting meaning from new information,
acquiring and storing knowledge, circumventing the limitations of working
memory, increasing the strength of memory, and recalling information. They
impose an organization on the information, guide retrieval, and provide
connections to prior knowledge.
In schema theory, the process of learning can be considered as encoding new
information in terms of existing schemas, as schema modification, or as the
creation of new schemas. The creation or modification of a schema is based on
conscious cognitive processing of information in working memory. In a more
general context, schema acquisition could be regarded as an example of a non-
linear process where the schema emerges from lower-level components during
learning or practice. As a cognitive unit, the schema represents a higher level of
organization than just a simple collection of lower-level components.
The need for the emergence of higher levels of schema hierarchy could be
associated with general limitations of human information processing. In a wider
context, any qualitatively new level of a system emerges in a non-linear way as a
means to overcome the combinatorial barrier caused by the immense number of
possible combinations of the variety of elements of the previous, lower level.
Examples of such processes are the emergence of the molecular level from atoms,
biochemical structures from molecules, or nerve impulses from biochemical
structures (Scott, 1995; Turchin, 1977). Structured neuronal groups might
represent the qualitatively new biological level of conscious cognitive functioning
(Edelman, 1992). On the psychological level of description, our abstract high-
level schematic knowledge representations in long-term memory (and
corresponding intellectual abilities associated with operating such structures)
might have emerged as a means of overcoming the combinatorial barrier under
conditions of limited processing capacity.
Because a schema is treated as a single unit in working memory, such high-
level structures require less working memory capacity for processing than the
multiple, lower-level elements they contain, making the working memory load
more manageable. Our abilities to construct and use higher-order hierarchical
cognitive configurations of knowledge structures in long-term memory might
have emerged during evolution as a way of providing structure to the elements
being dealt with by working memory (Sweller, 2003, 2004). Thus, by allowing
multiple elements to be treated as a single element in working memory, long-term
memory schematic structures may have, as one of their functions, the reduction of
working memory load.
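The effect of treating several elements as one unit can be illustrated with a minimal sketch. It assumes, purely for illustration, that schemas are known letter groups and that encoding proceeds greedily from left to right; the function name count_units and the letter strings are hypothetical and are not taken from the studies cited above.

```python
def count_units(elements, schemas):
    """Encode a sequence as units, preferring the longest known schema.

    elements: a string of individual elements (here, letters).
    schemas:  a set of strings already stored in long-term memory.
    Returns the list of units that would have to be held in working memory.
    """
    units = []
    i = 0
    longest = max((len(s) for s in schemas), default=1)
    while i < len(elements):
        # Try the longest candidate group first, down to groups of two.
        for size in range(min(longest, len(elements) - i), 1, -1):
            piece = elements[i:i + size]
            if piece in schemas:
                units.append(piece)
                i += size
                break
        else:
            # No schema applies: the element is held as a separate unit.
            units.append(elements[i])
            i += 1
    return units


letters = "FBICIANASA"
without_schemas = count_units(letters, set())
with_schemas = count_units(letters, {"FBI", "CIA", "NASA"})
print(len(without_schemas), len(with_schemas))
```

Under these assumptions, the same ten letters occupy ten units without schemas and three units with them, which is the kind of reduction in working memory load that the paragraph above describes.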
Specific schema selection in a particular situation is usually automated and
quick. Our first impression of an unfamiliar person (which is said to be the
most important) and our comprehension of movies, fiction, music, humor, or art are
guided by our acquired domain-specific schematic knowledge structures. Schemas
guide our recall of different past events. Our memory usually retains the gist of a
situation or event according to our schematic knowledge of it. The schema defines
what is encoded and stored. When recalling the event, we create schema
instantiations filling in missing information and inferring unavailable components
using our schemas for the event. Sometimes such recall may produce various
distortions to fit our schemas or expectations (e.g., recall scenes of court
procedures from movies and fiction stories with witnesses remembering details
they have not actually witnessed).
The structure of the schematic knowledge can be empirically assessed, for
example, by asking students to group problems into clusters on the basis of
similarity; to categorize problems after hearing only part of the text; to provide
answers to problems when content words have been replaced by nonsense words;
to solve problems when material in the text is ambiguous; to contrast problems
using a nominated principle; to recall problems that were presented earlier; to
identify which information within problems is necessary and sufficient for
solution; and to classify problems in terms of whether the text of each problem
provides sufficient, missing or irrelevant information for solution (‘text editing’)
(Low & Over, 1992).
Previously acquired schematic knowledge structures are the most important
factor that influences learning new material. A student’s understanding of an
instruction means instantiation of appropriate familiar schemas that would allow
her or him to assimilate new information with prior knowledge. A failure to
comprehend instruction might be caused by the lack of any appropriate schemas
in LTM, by the lack of sufficient cues in the situation to elicit a schema, or by the
learner applying a different schema than that intended by the instruction.
Students' preexisting schemas often resist change: everything that cannot be
understood within the available schematic frameworks is ignored or learned by
rote. It is important to build new knowledge on top of students' existing schemas
or help them to acquire an appropriate schematic framework by relating it to
something already known. Useful instructional techniques could be analogies or
diagrams, to establish links with existing knowledge, and advance organizers to
elicit or activate existing relevant schemas or provide new ones (concept maps,
headings, summaries at the start of chapters, etc.).
Similar to production systems, a schema-based approach to representing
knowledge provides a general framework that can be instantiated by specific
theories. In all schema-based models of cognitive architecture, schemas are
matched to the contents of working memory for recognition. If a schema is
partially matched by the information in working memory, it will create further
information to complete the match. Schemas instantiated in working memory
could be modified or reorganized, then placed back into long-term memory and
serve as a new, more specific schema for further recognition.
Schema theories do not differentiate between procedural and declarative
knowledge. Instructions for actions may be produced by matching a schema to a
situation and adding missing pieces of information. For example, recognizing a
situation as an instance of a schema for solving a simple linear algebraic equation
and recognizing values of corresponding slots would provide directions for necessary
operations. Production rules could be considered as a form of schematic knowledge.
There is a tendency toward convergence of the production-system and schema-based
approaches. For example, Koedinger and Anderson (1990) integrated
the two approaches by constructing a computational (production-system-style) model
of solving geometry problems using schema-based knowledge structures. The
schemas (‘diagram configuration schemas’) were described as clusters of
geometry facts that were associated with a single prototypical geometric image.
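The idea that matching a schema and filling its slots yields directions for action can be sketched as follows. This is only an illustrative sketch: the pattern LINEAR_SCHEMA, the slot names a, b, and c, the function instantiate, and the sample equation are all assumptions introduced here, and the sketch is not a reconstruction of any of the models cited above.

```python
import re
from fractions import Fraction

# Hypothetical schema for equations of the form a*x + b = c.
# The named groups play the role of the schema's slots.
LINEAR_SCHEMA = re.compile(
    r"^\s*(?P<a>\d+)\s*x\s*\+\s*(?P<b>\d+)\s*=\s*(?P<c>\d+)\s*$"
)


def instantiate(problem):
    """Match a problem statement to the schema and fill its slots.

    Returns (slots, directions, solution), or None when the schema is
    not elicited by this problem.
    """
    match = LINEAR_SCHEMA.match(problem)
    if match is None:
        return None
    slots = {name: int(value) for name, value in match.groupdict().items()}
    if slots["a"] == 0:
        return None  # not a linear equation in x
    # The filled slots determine the operations to carry out.
    directions = [
        f"subtract {slots['b']} from both sides",
        f"divide both sides by {slots['a']}",
    ]
    solution = Fraction(slots["c"] - slots["b"], slots["a"])
    return slots, directions, solution


print(instantiate("3x + 5 = 11"))
print(instantiate("x^2 = 9"))  # no matching schema, so None
```

In this sketch the same structure holds both the declarative part (what kind of situation this is) and the procedural part (what to do next), which is the sense in which schema theories do not separate the two kinds of knowledge.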
In this book, schematic knowledge structures will be used as the basic unit
and prevailing form of knowledge representations in long-term memory.
Accordingly, the approach to human performance that is based on studies of
schematic knowledge structures will be further referred to as a schema approach.
PROBLEM SOLVING AND THE NATURE
OF HUMAN EXPERTISE
All of our purposeful cognitive activities can be considered as problem
solving. Initially, in the 1950s and 1960s, most research studies on problem
solving were concerned with knowledge-lean task domains that required no
special training or background knowledge (for example, the famous ‘Tower of
Hanoi’ task, various puzzles, etc.). The study of such tasks led to the formulation
of a general theory of human problem solving (Newell & Simon, 1972). In this
theory, a problem contains three main components: a given state, a goal state, and
a set of operators for transforming the given state into the goal state. Problem-solving
activity is considered as a search in the problem space that consists of
separate problem states (knowledge states). The task of problem solving is to find
a sequence of operators that can transform the initial state into a goal state within
the problem space.
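The three components of a problem (given state, goal state, operators) can be made concrete with a short sketch for a three-disk Tower of Hanoi task. The state encoding and the function names hanoi_operators and search are assumptions made for this illustration. Breadth-first search is used only because it is simple and exhaustive; it is not meant as a claim about how people actually search a problem space.

```python
from collections import deque


def hanoi_operators(state):
    """Yield every state reachable by moving one top disk to another peg.

    A state is a tuple of three tuples; each inner tuple lists the disks
    on a peg from bottom to top (a larger number is a larger disk).
    """
    for src in range(3):
        if not state[src]:
            continue
        disk = state[src][-1]
        for dst in range(3):
            if dst == src:
                continue
            # A disk may only go onto an empty peg or onto a larger disk.
            if state[dst] and state[dst][-1] < disk:
                continue
            pegs = [list(peg) for peg in state]
            pegs[src].pop()
            pegs[dst].append(disk)
            yield tuple(tuple(peg) for peg in pegs)


def search(given, goal, operators):
    """Breadth-first search; returns the sequence of states from given to goal."""
    frontier = deque([given])
    parent = {given: None}
    while frontier:
        state = frontier.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for successor in operators(state):
            if successor not in parent:
                parent[successor] = state
                frontier.append(successor)
    return None


given = ((3, 2, 1), (), ())
goal = ((), (), (3, 2, 1))
path = search(given, goal, hanoi_operators)
print(len(path) - 1)  # number of moves in the solution found: 7
```

Each element of the returned path is one problem state, and each step between neighboring states is one application of an operator.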
So-called weak methods could be used in solving knowledge-lean tasks. We
often use general heuristics (rules of thumb) for choosing necessary sequences of
operators. For example, the difference reduction heuristic suggests choosing
operators that maximally reduce the difference between the current state and the
desired state. However, this method does not guarantee success in solving the
problem, and more advanced methods are usually adopted. Forward chaining
starts with the initial problem state, applies an operator selected by a heuristic,
and then repeats. Backward chaining starts with the desired
solution state, and a heuristically chosen operator is applied in reverse. A
subgoaling strategy chooses an operator and forms a subgoal to find a way to
change the current state so that the chosen operator could be applied. The method
of solving by analogy uses the structure of the solution to one problem to obtain
the solution to another problem (van Lehn, 1989).
The weak methods are often used in combined forms. For example, the GPS
(General Problem Solver) production system-based mechanism developed by
Newell and Simon (1972) uses the means-ends analysis method. This method
consists of looking for an operation that reduces the difference between the goal
and initial state, setting up subgoals whose solution provides a solution of the
original goal, and building up a hierarchical plan to solve a problem. Means-ends
analysis thus combines forward chaining and operator subgoaling: the current
state of problem solving is compared to the goal state and actions are selected to
reduce the difference (van Lehn, 1989).
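The combination of difference reduction and subgoaling can be sketched in a few lines. The sketch below is written in the spirit of means-ends analysis and is not a reconstruction of the GPS program; the Operator structure, the functions achieve and achieve_all, and the tea-making operators are hypothetical and serve only as an illustration.

```python
from typing import NamedTuple


class Operator(NamedTuple):
    name: str
    preconditions: tuple  # statements that must hold before applying
    adds: frozenset       # statements made true by the operator
    deletes: frozenset    # statements made false by the operator


def achieve(state, goal, operators, stack):
    """Return (new_state, plan) that makes `goal` true, or None."""
    if goal in state:
        return state, []      # no difference left for this goal
    if goal in stack:
        return None           # avoid circular subgoals
    for op in operators:
        if goal not in op.adds:
            continue          # operator does not reduce this difference
        # Subgoaling: first make the operator applicable.
        result = achieve_all(state, op.preconditions, operators, stack + (goal,))
        if result is None:
            continue
        new_state, plan = result
        new_state = (new_state - op.deletes) | op.adds
        return new_state, plan + [op.name]
    return None


def achieve_all(state, goals, operators, stack=()):
    """Achieve each goal in turn and check that none was undone."""
    plan = []
    for goal in goals:
        result = achieve(state, goal, operators, stack)
        if result is None:
            return None
        state, steps = result
        plan = plan + steps
    if all(goal in state for goal in goals):
        return state, plan
    return None


operators = [
    Operator("fill kettle", (), frozenset({"have water"}), frozenset()),
    Operator("boil water", ("have water",), frozenset({"hot water"}), frozenset()),
    Operator("steep tea", ("hot water", "have teabag"),
             frozenset({"tea ready"}), frozenset({"hot water", "have teabag"})),
]
final_state, plan = achieve_all(frozenset({"have teabag"}), ("tea ready",), operators)
print(plan)
```

Note that the solver has to keep the pending goals (the stack) available while it works on each subgoal. This bookkeeping is the part of the strategy that places demands on working memory.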
In the early 1980s, experiments with puzzle problems demonstrated that, even
after extensive problem solving by means-ends analysis, participants still did not
induce a simple solution rule. Rule induction occurred only after some additional
information had been provided (Mawer & Sweller, 1982; Sweller & Levine, 1982;
Sweller, Mawer, & Howe, 1982). Empirical evidence was obtained that extensive
practice in conventional problem solving was not an effective way of acquiring
schemas that are required to successfully solve corresponding problems (Owen &
Sweller, 1985; Sweller & Cooper, 1985; Sweller & Levine, 1982; Sweller,
Mawer, & Ward, 1983). These studies suggested that a means-ends strategy could
inhibit schema acquisition.
A means-ends strategy focuses attention on specific features of the problem
situation required to reach the goal and on reducing the difference between current
and goal problem states by selecting proper operators. Maintaining subgoals and
considering alternative solution pathways are cognitively demanding mental
activities that might result in working memory overload. Additionally, these
activities are unrelated to learning solution schemas that are critical for successful
future problem solving. They reduce resources devoted to learning other
important aspects of problem structure. For example, studies of two-step problems
demonstrated that cognitive load might be very high at the subgoal stages,
resulting in more errors than at the final goal stage (Ayres & Sweller, 1990).
Sweller and Levine (1982) demonstrated rapid learning of maze problem-solving
schemas when the specific goal state was unknown, and it was not
possible to reduce differences between the goal and given problem states. Sweller,
Mawer, and Ward (1983) found that using a means-ends strategy can actually
impair learning, and that less directed exploration of the problems facilitated
acquisition of useful problem schemas. They used simple physical and geometry
problems without a specific goal stated (goal-free problems such as ‘Calculate the
value of as many variables as you can’) and observed enhanced development of
problem-solving skills. Owen and Sweller (1985) found that problem solvers
using a means-ends strategy made significantly more errors than those using other
methods, supposedly due to the working memory load associated with means-
ends analysis.
In a theoretical investigation of the cognitive (working memory) load
phenomena, Sweller (1988) constructed and analyzed a computational model of
cognitive processes based on a theory of production systems (Newell & Simon,
1972). The model operates by matching elements on the condition side of each
production to elements in a working memory (for example, the knowns,
unknowns, goal, possible equations or theorems). If the condition side of a
production is matched by some of the elements in working memory, the
production can fire, and its action alters the content of working memory allowing
other productions to fire. The cognitive load in such a model could be measured
by considering the number of statements in working memory, the number of
productions, the number of cycles to solution, and the total number of conditions
matched. Application of this model to novice cognitive behavior in various
instructional procedures provided evidence of the heavy cognitive load associated
with a means-ends strategy compared with a forward-working goal-free strategy.
It also explained why the use of goal-free problems or worked examples was a more
effective means of acquiring schemas than conventional problem solving
(Sweller, 1988; Ayres & Sweller, 1990).
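The bookkeeping behind such load measures can be illustrated with a minimal forward-working production system. This is a sketch under stated assumptions and is not Sweller's (1988) model: the function run, the rule for choosing which production fires (the first applicable one), and the two toy productions are all hypothetical.

```python
def run(productions, memory, goal, max_cycles=100):
    """Fire one applicable production per cycle until `goal` is in memory.

    productions: list of (name, conditions, additions), where conditions
    and additions are sets of working-memory statements.
    Returns a dictionary of simple load indicators.
    """
    memory = set(memory)
    cycles = 0
    conditions_matched = 0
    peak_statements = len(memory)
    fired = []
    while goal not in memory and cycles < max_cycles:
        for name, conditions, additions in productions:
            # Applicable if every condition is in working memory and
            # firing would add at least one new statement.
            if conditions <= memory and not (additions <= memory):
                memory |= additions
                conditions_matched += len(conditions)
                fired.append(name)
                break
        else:
            break  # no production can fire
        cycles += 1
        peak_statements = max(peak_statements, len(memory))
    return {
        "solved": goal in memory,
        "productions_available": len(productions),
        "productions_fired": fired,
        "cycles": cycles,
        "conditions_matched": conditions_matched,
        "peak_statements": peak_statements,
    }


productions = [
    ("v = v0 + a*t", {"known v0", "known a", "known t"}, {"known v"}),
    ("p = m*v", {"known m", "known v"}, {"known p"}),
]
start = {"known v0", "known a", "known t", "known m"}
report = run(productions, start, "known p")
print(report)
```

A means-ends version of the same system would also have to store goal and subgoal statements in working memory, so its statement count and number of matched conditions would be expected to be larger for the same problem. The sketch does not model that comparison.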
Since the late 1970s, the research focus in problem solving shifted to studying
knowledge-rich task domains (algebra, geometry, physics, thermodynamics,
computer programming, chess, bridge, etc.) that required an essential knowledge
base as a prerequisite. Problem solving in such domains has additional
complexities. Representation of a problem requires a great deal of domain
knowledge, and operators that are usually used are domain-specific operators. The
central questions of research in such domains are how is knowledge used to build
up a problem representation and how does it influence the actual problem-solving
process (Reimann & Chi, 1989).
In semantically rich domains, problem solving involves searching one's
knowledge of the domain in order to find the operators for solving the problem.
Research on the use of knowledge in problem solving suggests that people use
two types of domain-specific knowledge to solve problems: declarative
conceptual knowledge (knowledge of the principles of the domain) and procedural
knowledge (knowledge of how to perform cognitive activities). Procedural
knowledge may be described as a set of production rules that define actions for
achieving goals (Anderson, 1983). Conceptual and procedural knowledge in
problem solving can be considered as organized into problem schemas. They form
the general framework of knowledge that corresponds to classes of problems.
Problem solving in complex domains thus can be viewed as finding an
appropriate problem schema in long-term memory and filling in this schema with
the specific parameters of the problem (Chi, Feltovich, & Glaser, 1981; Chi &
Glaser, 1985). The problem schema determines what conceptual knowledge is
used to build a representation of the problem statement, and what procedures are
used to solve the problem. Much research in knowledge-rich domains is
concerned with the differences between expert and novice problem solving. It has
become evident that experts' behavior is mostly determined by their knowledge base. Therefore, the learning processes in which the experts acquired this
knowledge are critical in explaining their performance. The focus of attention in
the later studies shifted to learning theories as theories of the acquisition of
expertise (Van Lehn, 1989).
A considerable number of recent research studies in cognitive psychology
have been concerned with the investigation of the structures and processes of
human competent performance as a consequence of learning. It is generally
accepted that development of expert performance is a very complex process
involving a great deal of deliberate effort. Studies have shown that at least 10
years of practice are necessary for people in various fields of culture and science
to reach superior levels of skilled performance (Ericsson & Charness, 1994;
Ericsson, Krampe, & Tesch-Romer, 1993; Simon & Chase, 1973).
Expert performance is usually acquired during extensive deliberate practice in
a domain. Such practice should be organized at an appropriate and challenging
level of difficulty, allow steady skill refinement by repetition and error correction,
and provide informative feedback to the learner (Ericsson et al., 1993; Ericsson &
Lehman, 1996). Competent expert performance generally requires well-developed
cognitive skills, well-organized structures of knowledge, and self-regulatory
performance control or metacognitive strategies (Glaser, 1990).
Well-developed cognitive skills as a major characteristic of expert
performance require functional (related to conditions of applicability) automated
knowledge (Fitts & Posner, 1967; Anderson, 1983, 1993; Klahr, Langley, &
Neches, 1987). The process of skill learning is claimed to occur in several stages.
In the first stage (the cognitive stage), a description of the procedure is learned in the
form of declarative knowledge. In the second stage (the associative stage), the
declarative information is transformed into a procedural form, and a set of
procedures for performing the skill is acquired. Such a process of converting
declarative knowledge into a procedural form is called proceduralization. In this
stage, two forms of knowledge (declarative and procedural) coexist. In the third
stage (the autonomous stage), the skill becomes more rapid and automatic (Anderson,
1983).
When knowledge becomes automated during the development of proficiency,
conscious processing capacity can be concentrated on higher levels of cognition.
Automated performance requires only limited attentional capacity. Processing that
once demanded active control can, after extensive practice, become automatic,
freeing limited attentional capacity for other tasks (Kotovsky, Hayes, & Simon,
1985; Schneider & Shiffrin, 1977; Shiffrin & Schneider, 1977). For example,
while the use of declarative knowledge initially requires much conscious
cognitive processing, automatic application of proceduralized knowledge frees
working memory and allows its capacity to be used for the processing of new
knowledge. Intensive training on certain procedural elements of a task can make
them more automatic and free cognitive capacity for other more creative elements.
This is especially important for transfer of training (Cooper & Sweller, 1987;
Howell & Cooke, 1989). Automated lower-level routine procedures enable
learners to concentrate on finding new ways of applying their knowledge in
unfamiliar situations.
The process of learning could be considered as the acquisition of new
schemas that eliminate the need to apply weak problem-solving methods (e.g.,
means-ends analysis) to solve future similar problems. The result is a shift from a
novice strategy of working backward from the goal using means-ends analysis
and subgoaling, to a more expert knowledge-based strategy of working forward
from the initial state to the goal. Availability of a sufficient set of relevant
domain-specific schematic knowledge structures that could be used in performing
tasks is an important feature of competent human performance. With experience
in a domain, knowledge is organized into larger interconnected aggregate
structures that explain the skilled performance of experts (Chi, Glaser, & Farr,
1988; Lord & Maher, 1991).
Under a schema-based approach, learning can take different forms. Schema
evolution is a central mechanism in the development of expertise. New
information could be encoded in terms of existing schemas without involving any
new schemas. Schemas evolve as they are applied and utilized as learner
experience in the domain increases. Another form of learning is restructuring or
creation of new schemas. In order to explain how schemas can be built up through
experience, Rumelhart and Norman (1981) proposed a mechanism of learning by
analogy. Initially, a new schema could be created by modeling it on an existing
schema followed by a process of refinement (tuning). When a learner encounters a
new situation in a familiar domain, she or he tries to interpret it using existing
schemas. If none of them suits the situation, the best existing schema can serve as
a model from which to start the tuning process. The characteristics of this model
that do not contradict the new situation are carried over into the new schema.
Planning and self-regulatory (metacognitive) skills allow experts to control
their performance, assess their work, and predict its results. These self-regulatory
skills are an important condition of expert ability to use the available knowledge
base (Chi, Bassok, Lewis, Reimann, & Glaser, 1989; Larkin, McDermott, Simon,
& Simon, 1980). Chi et al. (1989) proposed that students learn and understand
examples of problem solutions via the self-explanations they give while studying.
Students who are successful problem-solvers tend to study example exercises by
explaining and providing justifications for each action and relating these actions
to the principles and concepts of the domain. These students read the example
with understanding and self-monitoring. Students who are less successful
problem-solvers do not connect their explanations (if any) with their
understanding of the principles of the domain. During problem solving, successful
students may use examples for a specific reference, whereas less successful
students repeat them in search of ready-made solutions. The level of performance
significantly depends on the metacognitive skills that learners bring to the task.
Cognitive studies of human performance and learning have the potential to
greatly influence instructional design principles. Generally, instructional design
should minimize learners' involvement in activities that overburden their limited
working memory and be adapted to the learners’ available knowledge structures
in long-term memory. Appropriate design of instruction should be based on the
knowledge of characteristics of expert performance, expert-novice differences,
and the transition process from novice to expert. Cognitive models of expert
performance and their influence on the design of instruction are considered in the
following chapter.
Chapter 2
COGNITIVE STUDIES OF EXPERT-NOVICE
DIFFERENCES AND DESIGN OF INSTRUCTION
SCHEMA-BASED APPROACH TO STUDYING
EXPERT PERFORMANCE
The purpose of cognitive studies of human expertise is to identify the
cognitive structures and processes responsible for skilled performance. Expert
performance has been studied in a variety of domains, for example, chess (de
Groot, 1965), physics (Chi, Feltovich, & Glaser, 1981; Larkin, McDermott,
Simon, & Simon, 1980), programming (Anderson, Boyle, & Reiser, 1985) and
radiology (Lesgold, Rubinson, Feltovich, Glaser, Klopfer, & Wang, 1988), to
name just a few. Various techniques and approaches have been applied to find out
the organization of experts' knowledge, the characteristics of their understanding,
information processing requirements and the nature of competency in such areas
as chess (Chase & Simon, 1973; Simon, 1979), geometry (Greeno, 1977),
genetics (Smith & Goodman, 1984), physics (Larkin & Reif, 1976), electronic
troubleshooting (Brown & Duguid, 1989; Forbus & Gentner, 1986; Gitomer,
1988; Lesgold & Lajoie, 1991; Morris & Rouse, 1985; Perez, 1991; Rasmussen,
1986; Swezey, Perez, & Allen, 1988; Tenney & Kurland, 1988; Wiggs & Perez,
1988), and mechanical troubleshooting (de Kleer & Brown, 1983, 1984; diSessa,
1983; Forbus, 1984; Hegarty, 1991; Hegarty & Just, 1989; Heller & Reif, 1984;
Miyake, 1986; Reif, 1987; Stanfill, 1983; White, 1983; White & Frederiksen,
1986).
As discussed in the previous chapter, schemas are a major type of knowledge
representation in long-term memory that reflects prototypical features of objects,
situations, and events. To understand or interpret incoming information, the
human cognitive system matches this information with existing schemas
(Rumelhart & Norman, 1983). In general, studies of expert-novice differences
demonstrate that expertise is not so much a function of superior problem-solving
strategies or a better working memory as of a better domain-specific
schematic knowledge base.
Chunks have played an important role in the development of the
understanding of expert-novice differences. Since Miller's (1956) finding that
short-term memory is limited to approximately seven units, or chunks, of
information, a chunk has served as a unit of measurement for memory capacity. A
chunk can be considered as a generalized example of a schema. De Groot (1965,
1966) was one of the first psychologists to investigate expert-novice
differences and demonstrated that expertise can be explained by the enormous
amounts of knowledge that experts can access. In his classic studies, chess players
had to reconstruct the positions of chess pieces on a board, after a brief exposure
(5 seconds). De Groot's finding that chess masters could recall many more pieces
from briefly exposed real chess positions than novices was explained by masters
having larger chunks. Chase and Simon (1973) noticed that experts placed chess
pieces on the board in groups that represented meaningful configurations. The
experts did not show superior performance when random placements of the chess
pieces were used.
Egan and Schwartz (1979) studied expertise in electronics with a
methodology similar to that used by Chase and Simon (1973) in studying chess
expertise. They found that experts could reconstruct large circuit diagrams from
memory recalling them in chunks of meaningfully related components. The
experts were better than novices at recalling meaningful (not random) circuit
diagrams. The size, rather than number, of recalled chunks increased with study
time. Chase and Ericsson (1982) further suggested that the superior memory of
chess masters and other experts was due to possession of schema structures with
specific slots filled in with the index information that served as retrieval cues. The
material could be recalled by reading out the contents of these slots and selecting
schemas that corresponded to familiar stimuli.
The schema-based approach was successfully used to explain various
phenomena related to expert performance and differences between experts and
novices (Chi et al., 1981; Reimann & Chi, 1989). For example, in the domain of
physics, experts' categories were based on the principles of mechanics
(conservation of energy and momentum, etc.), whereas novices' categories were
based on objects and surface features stated in each specific problem (inclined
plane, spring, etc.). In the case of an object being balanced on an inclined plane,
the experts saw it as an example of a class of problems requiring a
balance-of-forces approach, while novices saw it as an inclined-plane problem type. The
failure of a novice to solve this problem may result from the fact that different
inclined-plane tasks may require different approaches (based on balance of forces,
energy conservation, etc.), and the presence of the inclined plane alone does not
determine the appropriate approach.
One of the reasons for novices' difficulties in problem solving is that they
activate only lower-level schemas that incorporate only surface aspects of the
problem, whereas experts activate higher-level schemas that contain information
critical to the problem solution (Chi & Glaser, 1985). Thus, experts categorize
problems in terms of deep structures such as the laws used to solve the problems,
while novices categorize problems based on surface structures such as common
physical attributes. The same problem may elicit different schemas for experts
than for novices.
Schematic knowledge structures in long-term memory effectively provide
necessary executive guidance during high-level cognitive processing (Sweller,
2003). Without such guidance and in the absence of external instructions, people
usually resort to random search or weak problem-solving methods such as means-
ends analysis (a gradual reduction of differences between current and goal problem states). Such methods are cognitively inefficient and time consuming.
They may impose a heavy working memory load interfering with construction of
new schemas (Sweller, 1988).
In contrast, when experts in a domain encounter a familiar problem situation,
they rapidly retrieve appropriate previously acquired schemas from long-term
memory and apply them in a cognitively efficient way (Chi et al., 1981; Larkin
et al., 1980). Schemas allow them to categorize different problem states and
decide on the most appropriate solutions. Due to their available knowledge base in
long-term memory, experts are able to avoid cognitively inefficient mental
activities and perform with greater accuracy and lower cognitive loads.
Schematic knowledge structures can be described functionally by indicating
how a person with a specific level of schema acquisition would act in relevant
problem situations. For example, without any schematic knowledge of procedures
for solving the equation 4x + 2 = 3 and in the absence of any guidance, a student will
treat each symbol separately and may try to use a means-ends analysis approach
by reducing differences between a current problem state and the goal state (x = ?)
or attempt to apply various random operations to the numbers.
With some previously acquired knowledge of an appropriate procedure,
another student may immediately proceed to subtract the constant 2 from both
sides of the equation: 4x + 2 – 2 = 3 – 2. The whole combination of elements (e.g.,
4x + 2) will be treated as a meaningful single unit or chunk. If a student has practiced
considerably with this kind of equation, the schema for this procedure may be
automated and her or his first solution step will be 4x = 1. Another, even more
experienced student may have all the relevant solution procedures well learned or
automated and would write the final answer (x = 1/4) almost immediately.
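The functional description above can be restated as a small sketch that lists the steps a student would write at each level of schema acquisition. The level labels ("procedure", "automated", "expert") and the function solution_steps are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction


def solution_steps(a, b, c, level="procedure"):
    """Return the written steps for solving a*x + b = c at a given level.

    level is a hypothetical label:
      "procedure" - the procedure is known and every step is written out;
      "automated" - the subtraction step is automated and skipped;
      "expert"    - the whole procedure is automated.
    """
    x = Fraction(c - b, a)
    full = [
        f"{a}x + {b} = {c}",
        f"{a}x + {b} - {b} = {c} - {b}",
        f"{a}x = {c - b}",
        f"x = {x}",
    ]
    if level == "procedure":
        return full
    if level == "automated":
        return [full[0], full[2], full[3]]
    if level == "expert":
        return [full[0], full[3]]
    raise ValueError(f"unknown level: {level}")


for level in ("procedure", "automated", "expert"):
    print(level, solution_steps(4, 2, 3, level))
```

The fewer intermediate steps that have to be generated and held consciously, the less working memory capacity the same equation demands.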
Similar examples of expert-novice differences could be demonstrated in other
areas. Each symbol in a wiring diagram could be treated as a separate element by
a novice electrician, while an experienced professional would see the whole
diagram as representing a complete system. To a reader who does not know the
language, a printed text might look like a collection of unfamiliar symbols, while fluent native
readers would be able to make sense of the whole text. They would treat
words or even combinations of words as single elements.
By combining multiple elements of information into a single chunk in
working memory, long-term memory schemas allow experts to avoid processing
overwhelming amounts of information and to effectively reduce working memory
load during high-level cognitive processing. In addition, experts are also able to
bypass working memory limitations by having many of their schemas highly
automated due to extensive practice. Human cognitive architecture has evolved in
such a way that information processing changes significantly as the information
becomes more familiar to an individual (Sweller, 2003). Schematic knowledge
structures held in long-term memory significantly influence the content and
characteristics of working memory by effectively transforming it into long-term
working memory (Ericsson & Kintsch, 1995).
An expert’s routine problem solving in a familiar domain usually involves
selecting an appropriate schema, adapting it to the problem, and executing the
solution procedure. Often it occurs as a direct recognition early in the perception
of the problem (Chi, Feltovich, & Glaser, 1981). Non-routine problem solving
includes additional procedures such as search (when more than one schema is
applicable to the situation) or combining schemas (when no single schema
covers the whole problem) (Larkin, 1985). Substantial evidence has accumulated
that a schema theory of problem solving can be successfully used to explain
experts' performance in various task domains (Reimann & Chi, 1989).
Building a problem representation is a key process in problem solving
(Larkin, 1985; McDermott & Larkin, 1978; Simon & Simon, 1978). It has been
found that experts spend more time on a qualitative analysis of the problem and
building explicit representations of the situation (for example, by drawing the
diagrams of causal relationships between the objects). Experts also form more
abstract and enriched representations than novices do. For example, according to
Chi, Feltovich, and Glaser (1981), experts classify physics problems based on
abstract physics categories and principles, while novices do it according to surface
characteristics of the problem. Thus, the level of problem representation depends
on the solver's problem schemas. An initial cue (first sentences in the problem
statement, etc.) may activate a particular schema that is then matched to the
problem. Any mismatch results in the rejection of that schema and triggering of
another schema.
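The activate, match, and reject cycle described above can be sketched in a few lines of code. This is only an illustrative toy, not a model taken from the cited studies: the Schema class, the feature sets, and the rule that a schema matches when all of its required features are present are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    """Hypothetical problem schema: cues that activate it and features it requires."""
    name: str
    cues: frozenset
    required: frozenset


def select_schema(cue, problem_features, schemas):
    """Return (accepted schema or None, names of rejected schemas)."""
    rejected = []
    for schema in schemas:
        # Only schemas activated by the initial cue become candidates.
        if cue not in schema.cues:
            continue
        # Assumed matching rule: every required feature is present in the problem.
        if schema.required <= problem_features:
            return schema, rejected
        # A mismatch rejects this schema, so the next candidate is triggered.
        rejected.append(schema.name)
    return None, rejected


# Hypothetical physics schemas, both activated by the cue "block".
schemas = [
    Schema("inclined-plane", frozenset({"block"}), frozenset({"block", "incline"})),
    Schema("energy-conservation", frozenset({"block"}), frozenset({"block", "spring"})),
]
chosen, rejected = select_schema(
    "block", frozenset({"block", "spring", "height"}), schemas
)
```

In this example the first activated schema is rejected because the problem contains no incline, and the second one is accepted.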
Successful problem solving in technical domains depends on the solver's
schemas for the causal relations between components of a technical system which
allow mental simulations of the system operation (de Kleer & Brown, 1983;
Gentner & Stevens, 1983; Miyake, 1986). Providing learners with a causal
description of a device’s operation in addition to information about its
components was shown to enhance their ability to operate the device (Kieras &
Bovair, 1984; Mayer, 1989a).
Different types of schemas are appropriate for solving different types of
problems. At higher levels of skill, the choice of schematic knowledge types is
determined by higher level structures in which an expert's representations are
organized (Hegarty, 1991). Initially, problem schemas are specific to the
situations from which they were induced. With experience, they become indexed
by the general principles and problem solving becomes faster and takes less effort.
Organization of the solvers' knowledge into large groups of chunks or schemas
decreases the demands on working memory and allows learners to activate
appropriate procedures. As soon as experts retrieve a problem schema, they
automatically access the procedures for solving the problem (Chi et al., 1981;
Smith, 1991).
The development of a problem representation can be viewed as sequential
attempts at schema refinement, which depends on the structure of the domain-
specific knowledge of the solver. This results in experts spending more time on
planning and using forward-working and efficient problem-solving processes
(Reimann & Chi, 1989). Empirical studies in various domains have revealed that
problem-solving strategies are determined by the nature of the problem
representations, differences in the organization of knowledge, and the number of
domain-specific problem schemas that solvers have because of their experience in
a domain (Larkin, 1985; Lesgold, Feltovich, Glaser, & Wang, 1981).
Experts’ performance is schema-driven. Experts possess more domain-
specific schemas and can access and use them more efficiently than novices.
Experts work forward, deriving the appropriate problem schema from the problem
statement. In contrast, novices’ performance is goal-driven. Novices work backward from the goal, searching for operators that will allow them to derive the
needed solution. However, working backwards is a default strategy that both
experts and novices use when there is no schema for a given type of problem. In
a novel situation, experts use various types of general heuristics together with
domain-specific knowledge (Perkins, Schwartz & Simmon, 1991; Rist, 1989;
Schultz & Luchheud, 1991).
Thus, expert performance depends on available problem representations,
knowledge base (facts, concepts, principles, knowledge of a system, and rules for
how to use this knowledge), availability of appropriate domain-specific schemas,
general procedures (strategies, heuristics, algorithms), and relations among all
these elements (Hart, 1986; Lesgold & Lajoie, 1991). According to Chi, Glaser,
and Farr (1988), the main features of competent expert performance are:
1) domain-specificity (experts exhibit superior performance mainly in their
own domains);
2) perception of problem situations by large meaningful patterns;
3) high speed of performance;
4) superior well-organized long-term memory knowledge base;
5) deep-level and principle-based problem representations;
6) thorough qualitative analysis of problems; and
7) strong self-monitoring skills.
COGNITIVE STUDIES OF EXPERT-NOVICE DIFFERENCES
AND INSTRUCTIONAL APPROACHES
Most studies of expertise have focused on discrete expert-novice differences
in solving specific tasks. Existence of a continuum between novices and experts
has been frequently ignored. As a result, our knowledge about the development of
expertise and about changes in cognitive processes as expertise is acquired is
limited. Groen and Patel (1991) suggested four developmental levels: 1) novices
with no training in the domain (possessing only common sense knowledge and
everyday experience); 2) intermediates who have received some instruction in the
domain; 3) sub-experts who have expertise in a closely related domain (they may
also be viewed as intermediates); and 4) experts who are always correct in solving
routine problems and solve them by way of forward reasoning. It is impossible for
novices to learn expert approaches directly. When expert rules are taught to
beginners, they form isolated pieces of knowledge that are not retained for a long
period of time (Groen & Patel, 1991). Thus, an existing theory of expert performance cannot be applied directly to instruction, and theoretical models of
student transition from one level to another should be developed.
Expert routine problem solving is traditionally associated with using a
forward-working strategy; novices tend to work backward. In the case of
unfamiliar problems, experts also use backward reasoning. The studies of Sweller
and his colleagues (Mawer & Sweller, 1982; Sweller & Levine, 1982; Sweller et
al., 1983) brought some understanding of when the switch occurs during the
development of expertise and what factors would facilitate the switch. It was
demonstrated that means-ends analysis might prevent the acquisition of problem-
specific rules because this method could leave no cognitive resources available for
meaningful learning.
Rule acquisition occurred or improved under conditions where subjects were
provided with information additional to the problem goal (for example, a set of
subgoals) or were given goal-free problems. Sweller et al. (1983) hypothesized
that the main factor responsible for this result was the kind of information a
learner focuses on during problem solving. If knowledge or schema acquisition is
an aim of problem solving, then the influence of the goal as a control mechanism
should be reduced.
In some studies, forward-reasoning intermediate-level medical students
performed more poorly than either experts or novices (Groen & Patel, 1991). This
result was explained by their dogmatic reliance on existing basic science
knowledge. When students' knowledge contains misconceptions, forward
reasoning might be harmful for learning. If they reasoned backward, then the
misconceptions would be just temporary hypotheses. It was suggested that in such
cases an emphasis should be placed on self-explanations and testing their
adequacy (explanation-based learning) rather than on correct problem solving
(Groen & Patel, 1991).
Most of the experimental evidence in the area of expert-novice differences
was obtained by contrasting performance of experts and novices. Schoenfeld and
Hermann (1982) conducted one of the first longitudinal studies of the relationship between problem perception and expertise. Students' perceptions of mathematical
problems were examined before and after intensive training in mathematical
problem solving. It was demonstrated that novices sorted problems based on
surface components mentioned in the problem statement. After the training, they
sorted them in a more expert-like way according to the principles of problem
solution. Thus, problem perception and problem schemas on which such
perception is based changed as learners became more experienced in the domain.
With the development of expertise, problem schemas change in their level of
specificity (diSessa, 1983; Forbus & Gentner, 1986; Kaiser, Jonides, &
Alexander, 1986). Initially induced from specific situations, they become more
general and indexed by the underlying principles (Chi et al., 1981). At higher
levels of development, schemas may also change from qualitative to quantitative,
representing relationships between components of problem situations more
precisely (Forbus & Gentner, 1986; Hegarty, Just, & Morrison, 1988). As people
gain more experience with technical systems, they learn relations between their
common subsystems and learn to chunk components of systems into these
subsystems (Hegarty, 1991). New information is then assimilated into existing
sophisticated knowledge structures.
The learning mechanisms and strategies evolve as a learner becomes more
experienced (Langley & Simon, 1981). Lesgold et al. (1988) hypothesized that
early learning is perceptual and different from later cognitive learning. Experts
use schemas to interpret incoming information, intermediates often reshape their
perceptions to fit the schema, whereas novices completely rely on their
perceptions. The previously mentioned decline in performance at intermediate
levels can also be due to the shift from perceptual learning to cognitive schema-
based learning.
According to the triarchic/global/local architecture of expert cognition
(Sternberg & Frensch, 1992), when processing information from new domains, an
expert relies mostly on controlled, global processing. If information belongs to the
expert's narrow area of expertise, she or he relies mostly on automatic, local
processing. Such local processing systems can operate in parallel, can be
automated, and are characterized by almost unlimited processing capacity. As
expertise develops,
learned portions of processing procedures are transferred to a local processing
system. This enables experts to automate more processing and thus to free global
processing resources for dealing with new situations (Sternberg & Frensch, 1992).
However, experts may be inflexible in new situations because it is difficult to
reorganize an automated schema. Experiments with bridge players confirmed that
experts were more affected when new task demands required changing deep,
abstract principles rather than surface features. Novices were more affected by
surface changes than by deep, abstract changes (Sternberg & Frensch, 1992).
Nevertheless, Schraagen (1993) demonstrated that when domain-specific
knowledge is missing, experts could still maintain a more structured approach
than novices could by making use of more abstract high-level knowledge.
According to the theory of skill acquisition (Anderson, 1983), the instruction
in specific performance procedures must be preceded by the instruction in the
concepts, rules, and principles of how things work (declarative knowledge). In
addition to the theoretical principles, the ability to apply them in concrete
situations should be developed (Morris & Rouse, 1985). A procedural approach
alone is not sufficient, because it is impossible to predict all possible situations in
advance, especially in complex domains like modern digital electronics. Thus,
training should combine knowledge of system principles with procedures of how
to use this knowledge in a specific context. In general, teaching expert performance might require a basic conceptual explanation of how things work,
practice in carrying out basic procedures, and variation in experiences for tuning
of procedural knowledge and the development of persistence and confidence
(Gentner & Stevens, 1983; Greeno & Simon, 1988).
Kieras and Bovair (1984) demonstrated that providing students with
conceptual models of a complex system prior to information on how to use that
system produced better recall, faster learning, and fewer errors in the operation of
the system. Combined structural and functional descriptions of system operations
are recommended for effective learning (Psotka, Massey, & Mutter, 1988).
However, specific instructional strategies should be based on the cognitive
requirements of particular tasks. The user does not always need a complete
knowledge of the system in order to be able to operate it.
For example, many experts in technical areas have a very limited
understanding of general physics principles but satisfactorily perform their duties.
If a device is simple, or a procedure is easily learned and practiced (e.g., a
telephone) there may be no need to provide a device model. The user may infer a
usable model without instruction (Kieras & Bovair, 1984). Limited underlying
knowledge and understanding of how certain functions are fulfilled are required
for operating and troubleshooting systems with simple functions. For more
complex systems, a deeper understanding of their components and operation is
required (Lesgold & Lajoie, 1991).
Novices often have difficulties integrating general theoretical concepts with
their intuitions because of conflicts between everyday meanings of new concepts
(e.g., acceleration, mass) and their meaning in theory (Reif, 1987), conflicts
between students' intuitive knowledge and theoretical laws (diSessa, 1982), or
because of the lack of procedural knowledge of solving specific problems that is
often not explicitly taught (Heller & Reif, 1984).
There have been two major approaches in using the results of cognitive
research on knowledge structures in the design of instructional systems (Glaser,
1990). The first approach has been developed in the tradition of knowledge
engineering in artificial intelligence and design of expert systems. It requires
exposing the learner to the knowledge characteristics of well-developed
expertise. The well-known example of a computer-based instructional system
designed in accordance with this approach is the GUIDON project (Clancey &
Letsinger, 1984).
The second approach has been developed in cognitive science and is based on
cognitive models of students' knowledge. For example, in instructional systems
based on qualitative models (Chi, 1988; Forbus & Gentner, 1986), a learner has to
progress from simple to more sophisticated domain-specific conceptual models
(e.g., coordinated functional, causal, and structural models; qualitative and
quantitative models). This progression occurs in the context of solving
specifically designed problems with gradually increasing levels of complexity. An
example of this approach is the program for teaching troubleshooting of electric
circuits QUEST (White & Frederiksen, 1986).
Similar ideas were realized in the STEAMER project (the simulator for
training engineers to operate steam propulsion plants aboard large naval ships).
The primary goal was to teach a robust conceptual model (rather than specific
procedures) that could be used to reason about the steam plant qualitatively
(Holland, Hutchins, McCandless, Rosenstein, & Weitzman, 1987). Abstract
graphic images of the steam plant were organized in a hierarchical manner with
the major plant parameters presented first, followed by more detailed simulations
of subsystem components.
SHERLOCK is an example of a coached-practice learning environment in
which learners compare their own performance with expert performance (Gabrys,
Weiner, & Lesgold, 1993; Lesgold & Lajoie, 1991). Such reflection, however,
may place a large demand on working memory if solution paths are long or
complicated. SHERLOCK supports reflection by a replay of the trainee's and an
expert's performance. During replay, the system provides a summary of the
information the user has obtained on previous steps. The system allows learners to
observe the expert's decision process, reasons behind it, and the overall goal
structure for the expert performance. This technique reduces the cognitive load
associated with remembering the details of the trainee's own performance while
observing the expert's actions (Gabrys et al., 1993).
Another well-known example of a similar approach is the model-tracing
methodology in intelligent tutoring systems (Anderson, 1993). The tutoring
system simulates a student’s cognitive behavior in real time and maintains a
model of the student's knowledge state. It provides an example-based learning
environment in which students can induce rules from examples. The learner's
actual performance is compared to the ideal structure of solution (production rules
model), and the student is kept on the correct solution path. The tutor estimates
the availability of acquired productions based on their correct and incorrect
applications and selects appropriate problems for exercises. Many tutoring
programs based on the model-tracing methodology have been effectively used in
the fields of programming, geometry proofs, and solving algebraic equations
(Anderson, Boyle, & Reiser, 1985; Anderson & Corbett, 1993; Anderson,
Corbett, Fincham, Hoffman, & Pelletier, 1992; Anderson, Farrell, & Sauers,
1984).
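The idea of estimating the availability of productions from their correct and incorrect applications, and then selecting the next exercise, might be sketched as follows. This is a deliberately simplified illustration and not the estimation procedure used in the model-tracing tutors themselves: the smoothed success ratio, the selection rule, and all production and problem names are assumptions made for the example.

```python
def availability(correct, incorrect):
    """Assumed estimate: smoothed share of correct applications of a production."""
    return (correct + 1) / (correct + incorrect + 2)


def next_problem(problems, history):
    """Pick the problem whose weakest production has the lowest estimate."""
    def weakest(productions):
        # Productions never applied so far default to (0, 0), i.e. an estimate of 0.5.
        return min(availability(*history.get(p, (0, 0))) for p in productions)
    return min(problems, key=lambda name: weakest(problems[name]))


# Hypothetical record: production -> (correct, incorrect) applications so far.
history = {
    "isolate-variable": (8, 1),
    "distribute": (2, 4),
    "combine-terms": (5, 5),
}
# Hypothetical exercises and the productions each one requires.
problems = {
    "P1": ["isolate-variable", "combine-terms"],
    "P2": ["isolate-variable", "distribute"],
}
choice = next_problem(problems, history)
```

Here the exercise that requires the least reliable production ("distribute") is selected, so practice is directed at the rule the student applies least successfully.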
COGNITIVE MODELS OF DEVELOPMENT OF EXPERTISE
AND INSTRUCTIONAL DESIGN
Cognitive studies of human performance and learning have demonstrated that
learning processes are supported by a basic cognitive architecture that includes a
powerful long-term memory and a limited working memory. Schema acquisition
and automation as the major learning mechanisms are critical in intellectual skills
formation. Studies of chess skills and other domains indicate that our knowledge
base provides the foundation of intellectual skills. Schemas held in long-term
memory allow experts to avoid processing overwhelming amounts of information
in working memory and thus by-pass working memory limitations.
Automatic processin