Nozawa thesis


University of Aizu, Graduation Thesis. March, 2012 s1170033

Abstract

User manuals for physical performance help us understand how a task is actually performed in a 3-d space. The literature on spatial information comprehension is scant when it comes to identifying the factors that lead to spatial comprehension of physical tasks. The literature on mental imagery and rotation is discussed here in the context of an experiment in which body rotations, object heights, and action combinations were studied to understand how mental rotation tasks are performed. The experiment reported in this thesis focused on matching body rotation-action-object height combinations shown from body height with overhead images. Two types of activities were used: holding a bat and swinging a bat. Five body rotations from full front to back views were used, with the bat held at chest and waist height. Results show that canonical viewpoints and angles across the display plane are somewhat preferred, although accuracy with non-canonical viewpoints and angles into the display plane was also high. The study thus suggests that with more practice and time spent, mental rotation tasks could be performed better.

1 Introduction

Mental imagery is an experience and an important aspect of our general understanding of how different objects function in space without direct visualization [1]. In a complex spatial world, mental imagery can present some complex cases of comprehension involving mental rotation. Mental rotation is the ability to rotate two-dimensional and three-dimensional objects in space, but as an internal representation in the mind. It is essentially about how the brain mentally moves objects in physical space in a manner that helps with spatial understanding and intelligence (including structural and functional attributes) of objects in space [2][3][4].

Research in psychology has provided ample literature demonstrating how people develop and customize mental models and perform mental rotation in order to carry out procedural actions in space. This is where technical illustrations can help: guidelines for them can be developed in a way that might help users perform mental rotations in a predefined or expected sequence. This leads us to the question of why we need technical illustrations for communicating visually complex information.

Technical illustration is the use of illustration to visually communicate complex information [5]. The main purpose of any technical illustration is to create expressive images that have meaning to the human senses and to the observer. The accuracy of technical illustrations in terms of dimensions and proportions helps readers with visual comprehension of the structural and functional aspects of a given object in space. Naturally, this is very important for showing body positions.

In this context, one must introduce the concept of kinesthetic learning. It is a learning style whereby the performer learns by carrying out a physical activity in actual physical space, rather than by thinking and conjecturing about a physical action. Gardner’s theory of multiple intelligences [6] mentions kinesthetic learning. Kinesthetic learners are thought to be the ones who prefer to physically try out and perform an action, drawing on their own bodily experience.

Technical illustrations are designed to act as visual aids that help replicate physical actions in the way intended by the instructors of the act. Technical illustrations show actions from the point of view of performers, especially if performers’ bodies must be positioned in a particular way to perform the actions. One can choose to understand complex information by using technical illustrations as an aid.

2 Review of the Literature

2.1 Overview

The technical communication literature is not very rich in studies, other than those by Krull et al. (2001; 2003; 2004), Szlichcinski (1980), Heiser and Tversky (2002), and a few others, which focus primarily on comprehension of procedural illustrations and less on body positions in space [7][8][9][10][11].

Efficacy of Technical Illustrations in a Technical Communication Environment
Masato Nozawa s1170033
Supervised by Prof. Debopriyo Roy

Traditionally, studies of mental imagery and rotation in experimental psychology have addressed this issue of object positions in space, but for comprehending the human ability to perform mental rotation tasks in space. This review of the literature is designed to explain two major factors that might help in comprehending physical actions and performing mental rotation in a 3-d environment.

● How do perception of depth and body/object-centered viewpoints influence comprehension of physical actions in a 3-d space?

● How do motor skills and learning influence the way we design technical illustrations?

Such understanding will help us comprehend what it takes to design technical illustrations of physical actions performed in a 3-d environment. What factors should technical illustrators consider when depicting user actions in space?

2.2 PERCEPTION OF DEPTH IN 2-D TECHNICAL ILLUSTRATION

Extensive research by Krull et al. (2004) suggests that graphics for physical tasks need to take into account the needs of users who will carry out actions in a physical environment [9]. This research suggests that graphics need to show tasks from the users' viewpoint and need to make clear how tools are to be used and the direction in which actions are to be exerted. The paper provides some sample graphic design guidelines.

Technical illustrations are useful only when readers are actually able to use their vision systems when performing tasks in three-dimensional space [5]. Readers might scan illustrations showing physical actions and use one type of vision system primarily for object identification, while the other type of vision system could be used for orienting their bodies in space [8]. People are often able to comprehend well the distance between objects or body parts across the display plane, where the space between objects is visible. In contrast, people find it relatively difficult to judge object positions when distances must be judged into the display plane [8].
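As a rough illustration of why distances across the display plane are easier to read than distances into it, the sketch below assumes a simple orthographic (parallel) projection. This is a simplification and not the camera model used for the thesis images. Under that assumption, a segment of length d that makes an angle θ with the display plane keeps d·cos θ of its length on the page, while d·sin θ points along the line of sight, where it is not directly visible. The function name and the 40 cm example length are hypothetical.

```python
import math

def split_extent(length, angle_deg):
    """Split a segment into its across-plane and into-plane parts.

    Assumes orthographic projection. angle_deg is the angle between the
    segment and the display plane (0 = lying across the plane,
    90 = pointing straight along the line of sight).
    """
    theta = math.radians(angle_deg)
    across = abs(length * math.cos(theta))  # extent visible on the page
    into = abs(length * math.sin(theta))    # extent hidden along the line of sight
    return across, into

# Hypothetical 40 cm gap between two hands, turned away from the viewer in steps.
for angle in (0, 30, 60, 90):
    across, into = split_extent(40.0, angle)
    print(f"{angle:>2} deg: across = {across:5.1f} cm, into = {into:5.1f} cm")
```

As the angle approaches 90 degrees, almost all of the distance lies along the line of sight, which is the situation the paragraph above describes as hard to judge.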

The problem is that when illustrations show different variants of body positions and physical actions, as in sports, people often have to perceive positions, objects, movements, and forces along the line of sight into the display plane, which obscures the view necessary to comprehend or copy the action exactly as it should be executed.

Research suggests that monocular vision dominates binocular vision in experiencing depth from 2-D pictures, and it has been speculated that binocular vision did not develop as a separate visual system but as an add-on to monocular vision [12]. Hochberg (1978) suggested that readers of 2-D illustrations in print or electronic media have only monocular vision to help them interpret what they see [13]. Krull concluded that relying on monocular cues reduces depth perception for 2-D illustrations, thereby making interpretation more difficult, and situates the choice of the illustration’s perspective (body-centered vs. object-centered) as a central consideration. Depth perception arises from a variety of depth cues. These are typically classified into binocular cues, which require input from both eyes, and monocular cues, which require input from just one eye [14].

So, an optimal illustration should always help readers see as many viewpoints as possible in the scene, and should show objects in such a way that almost no part of the body involved in the action is obscured in a manner that hampers understanding and execution of the task. This is what we call an object-centered view, where objects are placed across the display plane.

2.3 USER-CENTERED VS. OBJECT-CENTERED PERSPECTIVE

An illustration with an object-centered point of view positions objects across a display plane. This viewpoint, which could also be called a spectator’s view, allows objects to be placed so as to direct viewers’ attention without obscuring important parts of objects [6].

When we have to show a man pushing a cart, should we show the scene from behind, so that we see the back of the man pushing the cart? Although this fits the user-centered perspective, the cart will be obscured from direct view, and it would not be possible to gauge the hand placement in order to replicate exactly how the cart is being pushed. However, if a 1/3rd front or a 1/3rd side view is shown at waist length, it might be a lot easier to see most of the body and parts of the cart, including the hand placement of the man pushing the cart. Research by Heiser and Tversky (2002) with a furniture assembly task and Szlichcinski (1979; 1980) with hand positions supported the efficacy of partially rotated objects as compared to objects shown head-on or from full back [11][10].

Psychological research [8] concurs that two-dimensional representations of physical actions that take place in a three-dimensional world are best shown in canonical views, with objects in a three-quarter view and slightly below the camera position.


Although canonical views (slightly rotated viewpoints that show the maximum number of angles) are generally preferred, when it comes to replicating tasks, the choice between a spectator’s viewpoint (seeing the action as an observer and not as a doer) and a user-centered viewpoint (seeing the action as a doer and not as an observer) is less clear-cut and more context-driven. If the question is to judge the distance between the legs when pushing a heavy cart, then a complete side view might be the preferred option. However, if we need to see the grip and arm movements (stretching) when pushing the cart, side and zoomed-in front views might both be effective. This is important to understand because there are individual differences in the way people prioritize objects in space vis-a-vis the orientation of their bodies in space, with different interpretations of visual information [15] and different performance levels on the task [16].

2.4 UNDERSTANDING MOTOR SKILLS FOR TECHNICAL ILLUSTRATIONS DESIGN

While designing illustrations of physical actions in a user manual, technical illustrators should consider two important things.

● How are motor learning and performance developed?

● What are the best possible strategies for drawing technical illustrations (for different tasks) such that they help readers understand the physical actions: not only what needs to be accomplished, but exactly how it needs to be done?

Skills classified by task: A specific task, based on specific skills, could be classified in terms of how well defined the movement is along a discrete, serial, and continuous continuum.

Skills classified by Cognitive Elements: While netting the ball in a basketball game, there should be cognitive strategies for deciding on the precise nature of the jump and the throw (how high to jump and how far to throw). Perfecting the jump and throw to a certain level of efficiency reflects fine motor skill, the strategy behind such efficiency is cognitive skill, and the combination leads to the constant adaptation needed to reach a certain level of efficiency.

Skills Classified by Environmental Factors: With more environmental conditions and related unpredictability, cognitive skills might have more impact [17]. For example, when playing baseball, how to swing the bat to hit the ball when the ball swerves in the air due to windy conditions is a valid consideration.

3 MAJOR RESEARCH QUESTION AND HYPOTHESES

What might be the optimal viewpoint for comprehending a two-dimensional illustration showing physical actions in a three-dimensional space?

Hypotheses:

● Objects shown from a performer’s point of view should be easier to understand.

● Illustrations showing more angles across the display plane might be easier to understand.

● Levels of comprehension based on a two-dimensional illustration should differ based on whether the objects are shown at or below the camera position.

The purpose of the experiment designed for this study is not to measure motor skills and performance, but to identify them as a factor influencing performance and learning and, most importantly, to explore the extent to which readers are able to comprehend illustrations presented in print from different perspectives and with different depth perceptions.

Sample and Context: Forty-one students who are

non-native speakers of English (native Japanese

speakers) participated in this study.

Procedure

The experiment aims to understand how ordinary people understand images and relate them to images shown from different perspectives and camera positions. We asked test subjects to evaluate body images via matching tasks and to rate their confidence in their choices.

4 Method

Forty-one subjects took part in the experiment, and each subject rated 40 image types, divided into two blocks of 20 each. As part of its design, the experiment considered two sets of images. For the experiment, we generated images of body positions for two kinds of activities: a man holding a bat and a man hitting with a bat. The purpose of using two different types of objects was to explore whether object types influence how decisions about depth perception, display planes, and viewpoints (object-centered or performer-centered) are made. This paper only discusses the results generated from the image set related to the man with the bat. The other set (man with ball) has been discussed as part of another paper.

Each participant was handed two different sets as part of an in-class graded assignment, with each set having 20 test sheets. Each participant was first handed an instruction sheet in Japanese, and what was expected of them in the experiment was also explained orally in Japanese.

The volunteers explained to them the purpose of the experiment, what it aimed to achieve, and how each participant should approach the test. At that point, the participants were allowed to ask questions related to the experiment and voice any concerns. The volunteers were also available throughout the experiment to answer queries. No time limit was set for the participants to complete the experiment, but they were expected to complete their responses within 90 minutes. However, they were allowed to keep the answer sheets until the next class meeting exactly a week later. There were two reasons why no time limit was enforced.

(1) Students were allowed to think and re-think about the illustrations and to change their responses if they wanted to.

(2) Students had to complete a series of questions related to the experiment in Moodle as a graded assignment, and retaining the test sheets and referring back to them while answering the questions in Moodle was thought to be more enriching.

On each test sheet, participants were asked to circle the correct choice. The three options were labeled Picture A, B, and C. They also went on to report their second-best choice for each test sheet and their level of confidence in each response.

Instruments: Using a computer program called POSER Figure Artist, which maintains accurate three-dimensional relationships among body parts, the experimenter produced variations of viewpoints and body positions. Each position included two heights for each activity: chest and waist. The man-holding-bat version shows the man holding a bat centered in front of the chest or waist, with the hands gripping the bat from both sides. The man-hitting-with-bat version shows the man hitting with the bat at chest or waist height. These action gestures were captured for five positions in which the body rotates while the camera position remains constant: Front - 0 degrees (the man holding or hitting with the bat and facing the camera head on), 1/3 Side - 30 degrees, Side - 90 degrees, 1/3 Back - 120 degrees, and Back - 180 degrees. For all these images, the camera was positioned slightly above waist height.

Each set had five images, and there were 4 sets in total. Every set was rotated through the five angles mentioned above. The first set showed a man holding a bat at chest height; the second set showed hitting at chest height. The two other sets showed a man holding and hitting with a bat at waist height, respectively.
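To make the design concrete, the following minimal sketch enumerates the 2 actions x 2 heights x 5 rotations described above and confirms that they yield 20 combinations. The variable and function names are hypothetical; the labels and angles are taken from the description of the image sets.

```python
from itertools import product

ACTIONS = ("holding", "hitting")
HEIGHTS = ("chest", "waist")
# Rotation labels and angles as listed in the Instruments paragraph.
ROTATIONS = (
    ("front", 0),
    ("1/3 side", 30),
    ("side", 90),
    ("1/3 back", 120),
    ("back", 180),
)

def build_conditions():
    """Return every action-height-rotation combination as a dict."""
    return [
        {"action": a, "height": h, "rotation": name, "degrees": deg}
        for a, h, (name, deg) in product(ACTIONS, HEIGHTS, ROTATIONS)
    ]

conditions = build_conditions()
print(len(conditions))  # 2 actions x 2 heights x 5 rotations = 20
for c in conditions[:5]:
    print(c)
```

Each of the four action-height pairs corresponds to one of the sets of five images, and each dict corresponds to one test sheet for the bat image set.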

Once these images were generated, the camera was repositioned to capture images from the top for the above-mentioned sets. A matching top image was generated for each image from the sets above, with a displacement along the y- and z-axes to position the camera exactly above the head. Each of the images generated for the 4 sets was tested to see whether readers could identify the same pose when shown from the top. Each test sheet had an image from the above sets, with three top views, of which only one correctly represented the view shown from slightly above waist height. Each test sheet had three questions, and questions 1 and 3 were answered on a Likert scale.

1 Identify the most appropriate picture shown from the top that matches the picture shown from the waist height. (Three options provided).

2 Which illustration shown from the top is the second-best match?

3 How confident are you about your response?

Findings:

A comprehensive review of the data shows some variation in accuracy between the 20 different body position-height-action combinations used for this analysis. Subsequent analysis tested whether the differences in mean accuracy between body positions, discussed in the following paragraphs, are statistically significant.

The data show the highest mean accuracy for the man holding the bat at chest height with 1/3rd side rotation, at .93 (meaning 93% of the participants completed the matching task accurately), followed by holding waist 1/3rd back and holding waist back and front, each with a mean score over .90. Only one score from the hitting category, hitting waist back rotation, reaches .93. Interestingly, almost all the highest levels of accuracy for any given matching task are recorded for holding-bat positions; hitting positions at any angle (except hitting waist back rotation) have lower levels of accuracy.

Further, the data show that all the highest frequencies are recorded for the 1/3rd side, back, and 1/3rd back positions. The lowest mean accuracy scores were recorded for hitting chest 1/3rd back positions at .66 and hitting waist 1/3rd side positions at .68, with other hitting positions also recording lower mean accuracy scores. For accuracy scores over 80%, the frequency data show over 30 individuals performing the matching task correctly.
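The relationship between the reported mean accuracy scores and head counts can be sketched as follows. This assumes that all 41 participants answered every test sheet, which the text does not state explicitly; the function name is hypothetical.

```python
N_PARTICIPANTS = 41  # sample size reported in the thesis

def correct_count(mean_accuracy, n=N_PARTICIPANTS):
    """Nearest whole number of correct responses for a reported proportion."""
    return round(mean_accuracy * n)

for label, acc in [("holding chest 1/3 side", 0.93),
                   ("hitting chest 1/3 back", 0.66),
                   ("hitting waist 1/3 side", 0.68)]:
    k = correct_count(acc)
    print(f"{label}: about {k} of {N_PARTICIPANTS} correct ({k / N_PARTICIPANTS:.2f})")
```

Under the same assumption, an accuracy above 80% corresponds to at least 33 correct responses out of 41, consistent with the "over 30 individuals" noted above.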

I then performed a non-parametric Cochran’s Q test for binary data (0 = inaccurate; 1 = accurate) for the 5 angular rotations at “holding bat at chest height”. With a Cochran’s Q value of 4.182 and p = .382 > .05, the results show no significant difference between the matching tasks for the 5 angular rotations at holding chest height.
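For readers unfamiliar with the statistic, the sketch below computes Cochran's Q from a table of 0/1 scores using the standard textbook formula, Q = (k - 1)(k ΣC² - N²) / (kN - ΣR²), where C are the column totals, R the row totals, N the grand total, and k the number of tasks. It is not the software used in the thesis, and the small data table is invented purely for illustration.

```python
def cochran_q(rows):
    """Cochran's Q for a list of equal-length rows of 0/1 scores.

    Each row is one participant; each column is one matching task.
    Returns the Q statistic and its degrees of freedom (k - 1).
    """
    k = len(rows[0])                      # number of tasks (treatments)
    col_totals = [sum(r[j] for r in rows) for j in range(k)]
    row_totals = [sum(r) for r in rows]
    n = sum(row_totals)                   # grand total of correct answers
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - n * n)
    denominator = k * n - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when every row is all 0s or all 1s")
    return numerator / denominator, k - 1

# Invented toy data: 4 participants x 3 tasks (NOT data from the study).
toy = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
q, df = cochran_q(toy)
print(f"Q = {q:.3f}, df = {df}")
```

With five angular rotations per group, as in the tests reported here, the statistic has 4 degrees of freedom.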

An overall Cochran’s Q test for all 20 combinations (5 angular rotations, height, and bat action) shows a Q value of 17.968, with Asymp. Sig. = .525.

For hitting at chest position for the 5 angular rotations, the data show a value of 6.552 with Asymp. Sig. = .162. Although the difference between the 5 matching tasks in this group is not statistically significant, the data are more varied than in the “holding chest” group.

For the “holding bat at waist height” combinations across the 5 angular rotations, the data show no significant difference between the mean accuracy scores, with a Cochran’s Q value of 2.615 and p = .624. However, the data do indicate that accuracy is less varied for the “holding-waist” group than for the “hitting-chest” group.

The data further show a statistically significant difference in accuracy scores between the 5 angular body positions for the “hitting-waist” combinations. With a Cochran’s Q value of 14.122 and p = .007 < .05, we see that the angular rotations for hitting waist positions did not produce the same accuracy scores. This shows that, compared to all other groups, the data are more varied between these 5 matching tasks.
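As a consistency check on the four within-group tests, the sketch below recomputes the asymptotic p-values from the reported Q values. It assumes that Q is referred to a chi-square distribution with k - 1 = 4 degrees of freedom, which is the usual asymptotic approximation and matches the "Asymp. Sig." wording above. For even degrees of freedom the upper-tail probability has a closed form, so only the standard library is needed. The function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total

REPORTED = {  # group: (Q, reported p), values copied from the text
    "holding chest": (4.182, 0.382),
    "hitting chest": (6.552, 0.162),
    "holding waist": (2.615, 0.624),
    "hitting waist": (14.122, 0.007),
}

for group, (q, p_reported) in REPORTED.items():
    p = chi2_sf_even_df(q, df=4)  # 5 rotations -> 4 degrees of freedom
    print(f"{group}: Q = {q}, p = {p:.3f} (reported {p_reported})")
```

The recomputed values agree with the reported ones to three decimal places, so only the “hitting-waist” group falls below the .05 threshold.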

Comparing the four front positions across chest and waist heights and across actions (holding or hitting), mean accuracy scores range from 78% to 90%. Comparing the 1/3rd side positions across chest and waist heights and across actions, mean accuracy scores range from 68% to 93%. Results suggest that hitting waist 1/3rd side, with 68% accuracy, was much lower than the other 1/3rd side combinations. The other hitting position, at chest height, had much higher accuracy at 83%. Overall, it looks like the holding positions were relatively easier to complete. Results suggest that hitting positions at side angles have relatively lower levels of accuracy, around 76% to 78%, while the holding positions (chest and waist) have higher accuracy scores at 85%.

For the confidence self-reports on the 5 angular rotations for the holding chest positions, self-confidence levels on a scale of 1 to 5 vary between 3.50 and 3.98. Interestingly, the front and 1/3rd side positions clearly show higher levels of comfort and confidence.

For hitting chest positions, we clearly see a lower confidence level, around 3.5, for all the given angular rotations. Although confidence levels are lower than for the holding chest positions, within this hitting-chest group we still see a difference in confidence levels between the front position at 3.43 and the 1/3rd back position at 3.63. Surprisingly, confidence is higher for the 1/3rd back position, whereas for the back position it is considerably lower.

These data are not conclusive and do not establish a clear pattern, but there are some indications that, for canonical viewpoints, actual accuracy scores and confidence are correlated.

4 Discussions

In the review of the literature, we included a section on how motor skills and related performance develop for physical tasks. We wanted readers to be conscious of the fact that the types of actions shown (discrete, serial, or continuous), cognitive information processing by the actor, the linking of motor skills and cognitive elements, and environmental factors, e.g., wind (while bowling in a game of cricket) or ground slope (when playing golf), technically and practically have a bearing on how the physical task is completed. However, in this study we could not consider these as factors that influence understanding of illustrations. Rather, we wanted readers to know that they become a factor when readers try to emulate the action based on their comprehension of the illustration. Comprehension of how the task is to be completed and actual implementation of the task are different things, and readers should be aware that actual implementation needs more calculation and judgment based on the specific context of action, which probably cannot always be designed into technical illustrations. Motor learning is based on motor performance and is different from learning about an action from a technical illustration with visible viewpoints. Technical illustrations will work for initial comprehension of action patterns, but beyond that, motor learning and performance should complement each other.

5 Conclusions

This study aimed to carry forward the studies performed by Krull et al. (2004) [9]. Compared to the previous studies by Krull et al., this study aimed to include more variations in actions and body heights and to make those positional features more explicit and detailed. Further, the aim was to include a serious group of participants who actually participated in this exercise for a grade. Future studies should continue to include more variations in body height and action types, with more details and objects in and across the line of sight. This study does allow us to see the importance of different variables and how they influence performance. More testing is needed before we can reach a definite conclusion about the preferences that readers might have for visualization purposes. Finally, besides testing different variations of body height-action combinations, future testing should also alter the design of the current experiment to include, more systematically, more options for test sheets.

References

[1] Michel-Ange Amorim et al., “Embodied Spatial Transformations: Body Analogy for the Mental Rotation of Objects,” American Psychological Association, Vol. 135, No. 3, pp. 327-347, 2006.

[2] Johnson A.M., “The speed of mental rotation as a function of problem-solving strategies,” Perceptual and Motor Skills, Vol. 71, No. 3, pp. 803-806, Dec. 1990.

[3] Jones B. et al., “Effects of sex, handedness, stimulus and visual field on ‘mental rotation’,” Cortex, Vol. 18, No. 4, pp. 501-514, Dec. 1982.

[4] Hertzog C., “Age differences in components of mental-rotation task performance,” Bulletin of the Psychonomic Society, Vol. 29, No. 3, pp. 209-212, May 1991.

[5] Viola I., Kanitsar A., Groller M.E., “Importance-driven feature enhancement in volume visualization,” IEEE, Vol. 11, No. 4, pp. 408-418, July-Aug. 2005.

[6] H. Gardner, Frames of Mind: The Theory of Multiple Intelligence. New York: Basic Books, 1983.

[7] Krull R., “Writing for Bodies in Space,” Proceedings of the IEEE Professional Communication Society, September 2001.

[8] Robert Krull, Debopriyo Roy, Shreyas D’Souza, Marilyn Morgan, “User Perceptions and Point of View in Technical Illustrations,” STC Proceedings, 2003.

[9] Robert Krull, Shreyas J. D’Souza, Debopriyo Roy, and D. Michael Sharp, “Designing Procedural Illustrations,” IEEE Transactions on Professional Communication, Vol. 47, No. 1, March 2004.

[10] Szlichcinski, “The syntax of pictorial instructions,” in P.A. Kolers, M.E. Wrolstad, and H. Bouma (Eds.), Processing of Visible Language, Vol. 2, pp. 113-124, 1980.

[11] Heiser J. and B. Tversky, “Diagrams and Descriptions in Acquiring Complex Systems,” Proceedings of the 24th Annual Meeting of the Cognitive Science Society, Fairfax, VA, August 2002.

[12] C.J. Erkelens, “Interaction of monocular and binocular vision,” Perception 39, ECVP Abstract Supplement, 2010.

[13] Kenneth J. Hochberg, “A Signed Measure on Path Space Related to Wiener Measure,” The Annals of Probability, Vol. 6, No. 3, Jun. 1978.

[14] H. Goldstein, “Communication Intervention for Children,” Journal of Autism Developmental Disorders, Vol. 32, No. 5, October 2002.

[15] A. David Milner and Melvyn A. Goodale, “The Visual Brain in Action,” Great Clarendon Street, Oxford OX2 6DP, 1995.

[16] Zacks et al., “Mental Spatial Transformations of Objects and Perspective,” Spatial Cognition & Computation, pp. 315-322, 2002.

[17] Schmidt Richard and Wrisberg Craig, “Motor Learning and Performance,” Human Kinetics Publishers, United States, 2008.