
  • Linköpings universitet, SE–581 83 Linköping, +46 13 28 10 00, www.liu.se

    Linköping University | Department of Computer and Information Science

    Master’s thesis, 30 ECTS | Computer Science and Engineering

    2019 | LIU-IDA/LITH-EX-A--19/022--SE

    Evaluation of solutions for a virtual reality cockpit in fighter jet simulation
    Utvärdering av lösningar för virtuell cockpit inom flygsimulering

    Tobias Martinsson

    Supervisor: Sahand Sadjadee
    Examiner: Erik Berglund

    External supervisor: Stefan Furenbäck

    http://www.liu.se

  • Upphovsrätt

    Detta dokument hålls tillgängligt på Internet - eller dess framtida ersättare - under 25 år från publiceringsdatum under förutsättning att inga extraordinära omständigheter uppstår. Tillgång till dokumentet innebär tillstånd för var och en att läsa, ladda ner, skriva ut enstaka kopior för enskilt bruk och att använda det oförändrat för ickekommersiell forskning och för undervisning. Överföring av upphovsrätten vid en senare tidpunkt kan inte upphäva detta tillstånd. All annan användning av dokumentet kräver upphovsmannens medgivande. För att garantera äktheten, säkerheten och tillgängligheten finns lösningar av teknisk och administrativ art. Upphovsmannens ideella rätt innefattar rätt att bli nämnd som upphovsman i den omfattning som god sed kräver vid användning av dokumentet på ovan beskrivna sätt samt skydd mot att dokumentet ändras eller presenteras i sådan form eller i sådant sammanhang som är kränkande för upphovsmannens litterära eller konstnärliga anseende eller egenart. För ytterligare information om Linköping University Electronic Press se förlagets hemsida http://www.ep.liu.se/.

    Copyright

    The publishers will keep this document online on the Internet - or its possible replacement - for a period of 25 years starting from the date of publication barring exceptional circumstances. The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/hers own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.

    © Tobias Martinsson

    http://www.ep.liu.se/

  • Abstract

    Virtual Reality has become widespread in areas other than gaming. How this type of technology can be used in, for example, flight simulation still needs to be discovered. In this thesis, virtual reality technology and free-hand interactions are examined in the context of a fighter jet cockpit. Design principles and visualization techniques are used to examine how a virtual reality cockpit and interactions can be designed with high usability. From user test sessions, and an accompanying questionnaire, some guidelines for how this type of interaction should be designed are gathered. Specifically, how objects that can be interacted with, and the distance to them, should be visualized with regard to free-hand interaction. Also, different ways of providing feedback to the user are discussed. Finally, it is determined that the technology used is a good fit for the context and task, but the implementation of interaction components needs more work. Alternative ways of tracking hand motions and other configurations for the sensors should be examined in the same context.

  • Acknowledgments

    I would like to thank my supervisor Stefan for all of the help and the support during the thesis project. I would also like to thank my examiner Erik for his valuable feedback. Finally, I would like to thank my opponent Petter Granli for his notes and comments on the thesis.


  • Contents

    Abstract iii

    Acknowledgments iv

    Contents v

    List of Figures vii

    List of Tables viii

    1 Introduction 1
    1.1 Motivation 1
    1.2 Aim 2
    1.3 Research Questions 2

    2 Theory 3
    2.1 Leap Motion 3
    2.2 Related Work (or Guidelines for Interaction Design and Visualization) 4
    2.3 Usability 5

    2.3.1 Efficiency 5
    2.3.2 Operability 5

    2.4 Usability Testing and Evaluation 6
    2.4.1 Evaluating the Leap Motion Controller with Virtual Reality 6
    2.4.2 Post-test Questionnaire 7
    2.4.3 System Usability Scale 7
    2.4.4 Post-study System Usability Questionnaire 7
    2.4.5 User Experience Questionnaire 8
    2.4.6 Simulator Sickness Questionnaire 8

    3 Method 9
    3.1 Implementation 9

    3.1.1 The Unity Game Engine 9
    3.1.2 Interaction and the Leap Motion Controller 10
    3.1.3 Implementing Virtual Reality Support 10

    3.2 Evaluation 10
    3.2.1 Tasks and User Test Session 10
    3.2.2 Measuring Execution Time 12
    3.2.3 Questionnaire 12

    4 Results 14
    4.1 Implementation 14

    4.1.1 Interaction Components and Leap Motion 14
    4.1.2 Implementation of Flight Controls 17


    4.1.3 Virtual Reality Implementation 17
    4.1.4 The Final Prototype 17

    4.2 Evaluation 18
    4.2.1 Questionnaire Answers 18
    4.2.2 User tasks and execution time 20
    4.2.3 Observations and think-aloud feedback 20

    5 Discussion 22
    5.1 Results 22

    5.1.1 Analysis of VRSQ answers 22
    5.1.2 Post-test Questionnaire 23
    5.1.3 User task result 23

    5.2 Method 24
    5.2.1 Implementation 24
    5.2.2 User Test and Evaluation 25
    5.2.3 Choice of Questionnaire 25

    5.3 The work in a wider context 26

    6 Conclusion 27
    6.1 Future work 28

    Bibliography 30

    A Questionnaire 33


  • List of Figures

    2.1 Leap Motion Controller 3

    4.1 The interaction components in the tutorial environment 15
    4.2 The second version of the knob component 16
    4.3 Overview of the cockpit 18
    4.4 Average of Post-test questionnaire answers 19


  • List of Tables

    4.1 Previous experience per user 18
    4.2 Difference between post- and before-test VRSQ answers 19
    4.3 User task results 20


  • 1 Introduction

    The field of virtual reality is undergoing rapid improvement and has gained attention and grown in recent years [1]. The main application of virtual reality so far has been in gaming and game-related areas [1]. However, entertainment is not the only field where virtual reality has found a use; other areas such as medicine have used virtual and augmented reality in training. The aviation field has also begun to study and use virtual reality for different scenarios, mainly training. [2]

    Virtual reality, henceforth VR, as it is used here, is well defined by Feng et al. [3] as using computers, headsets, and sometimes physical rooms to create a realistic world. In the book Virtual Reality Technology [4], the definition differs slightly, with the addition that the user can interact with the simulation in real time. The headsets or goggles used in modern VR experiences are so-called head-mounted displays (HMDs); the VIVE1 and the Oculus Rift2 are two examples. Interaction with these modern VR technologies consists primarily of head tracking and handheld controllers.

    1.1 Motivation

    As mentioned, VR has grown and evolved recently and gained ground in new markets. What still needs to be discovered are the limitations and possibilities of the technology in areas other than gaming. It could be useful, for example, in flight training and simulation. An example of what a non-VR simulation might look like can be found at an exhibition at the air force museum in Linköping3. This simulation consists of three large screens and a simplified cockpit. The problem with that type of simulation might be that it is large, expensive, and hard to set up. With VR, we get a simpler solution that will probably not replace existing simulators but might be used for other tasks.

    1. https://www.vive.com
    2. https://www.oculus.com
    3. https://www.flygvapenmuseum.se/besok-oss/jas-39-gripensimulator/


    The study presented in this thesis was performed in collaboration with the company Saab4. Saab is a defense company that develops air defenses, including fighter jets, among other products. In the process of developing these airplanes, simulations are used for training and development purposes.

    A previous study had been conducted at Saab that examined how VR technology can be used as a complement for flight simulation. That study implemented a VR experience in a commercial flight simulator. The user mainly interacted with the simulation using a joystick in a hands-on throttle and stick (HOTAS) setup. It was determined that this could be somewhat limiting because the pilot may sometimes also need to interact with panels, buttons, and switches in the cockpit.

    The technology used in this study is relatively new. This makes it interesting to see whether newer HMDs and accompanying interaction technologies can provide the user with enough precision in their interaction that they might serve as a good complement to a simulation with three screens and a physical cockpit.

    1.2 Aim

    The purpose of this study is to determine whether it is possible to implement a VR simulation of a cockpit so that the instruments are readable and usable, in the context of flight training. In addition, it should be determined what type of interaction the user can reliably perform with their hands. The physical hand interaction will be conducted with the Leap Motion5 technology.

    The goal of this project is to produce a prototype of a fighter jet cockpit in VR in which the user can interact with the simulation with their hands. The user should be able to fly using a joystick and throttle. A VR headset should be used as a display, and the Leap Motion Controller should be used to enable free-hand interaction.

    1.3 Research Questions

    Alonso-Rios et al. define efficiency, in the context of this thesis, as a measurement of execution time and human effort. Further, they define operability as a measurement of the completeness and precision of a system. [5]

    • How can a virtual reality simulation for a plane cockpit be designed so that user interactions have high usability in terms of user efficiency and operability?

    4. https://saab.com/
    5. https://www.leapmotion.com/


  • 2 Theory

    In this chapter, relevant theory is presented. This includes definitions and descriptions of the evaluation methods and technologies used in the study.

    2.1 Leap Motion

    The Leap Motion (Figure 2.1) is a sensor and controller for interacting with the hands. The user can interact with their bare hands, and no other peripherals need to be attached. Leap Motion has development tools available that integrate with virtual reality headsets. [6]

    Figure 2.1: Leap Motion Controller, available from: https://www.leapmotion.com/press/

    The Leap Motion Controller (LMC) API gives information regarding the position and movement of the arms, hands, and fingers [6]. A study [7] reviewing natural user interfaces and the LMC determined that the LMC can be very useful in certain situations. Among these are applications where hand motions in the real world are mapped directly to the same motions in the virtual world, as well as some medical training situations. However, there are some problems with the LMC.


    According to the authors, the detection of hands can be problematic in poorly lit situations. Detection problems also occur when hands and fingers overlap.

    A common consensus regarding the LMC is that it is not a good replacement for typical interaction methods, such as mouse and keyboard, for typical interaction tasks [7, 8]. In direct comparison to a mouse for pointing and clicking tasks, the LMC has a much higher error rate [7]. Even though the LMC does not outperform a mouse for these types of interactions, nothing suggests that the LMC could not have high usability with a properly designed interface developed specifically for free-hand interaction.

    2.2 Related Work (or Guidelines for Interaction Design and Visualization)

    Nanjappan et al. discuss, in their article “User-elicited dual-hand interactions for manipulating 3D objects in virtual reality environments”, the challenges of hand interactions in a virtual environment. Although data gloves are arguably the truest form of hand-gesture input, they explore suitable interactions for manipulating 3D objects using dual-hand controllers, which are becoming standard across major HMD virtual reality solutions. [9]

    In the study by Nanjappan et al., test subjects were asked to perform a series of movements to achieve different tasks within the virtual environment. Later, the movements were evaluated, and the following conclusions were drawn:

    • Users prefer one-hand interaction

    • If two hands are required, users prefer to alternate between hands instead of simultaneous hand motion

    • Users prefer shoulder motions for interaction

    Waraporn Viyanon and Setta Sasananan evaluate navigation using the Leap Motion Controller in an interior design application in their paper “Usability and Performance of the Leap Motion Controller and Oculus Rift for Interior Decoration.” They note that the best way to evaluate an interface is to observe people using it, and that is how they evaluated their application. The evaluation concludes that the Leap Motion Controller is not accurate enough for detecting hand movements. [10]

    In a study by Vosinakis and Koutsabasis, usability and visualization with a combination of Leap Motion and Oculus Rift were examined. They studied four different techniques for visualizing the proximity of the user’s hand to objects. The motivation for why this type of visualization might be needed is that users have experienced depth problems when interacting with the Leap Motion Controller. [11]

    In Vosinakis and Koutsabasis’ test, they had users try to grasp an object and place it inside another object. This was performed with and without VR, but both times with a Leap Motion Controller. They used four different ways of visualizing how close the user’s hand was to the object. These four were: object coloring, connecting line, object halo, and shadows. The connecting line visualization consists of a line from the hand to the object and displays the distance. The shadow method displays a shadow on the surface beneath the hand and object. Object coloring and object halo both use coloring to indicate how close the user is. The object coloring method colors the entire object, and the object halo method displays a colored halo around the object. [11]

    Vosinakis and Koutsabasis’ experiment found that the combination of VR and Leap Motion outperformed the desktop and Leap Motion configuration in all tests.


    Regarding the visualization, they did not find much evidence to support which visualization method had higher usability. What the results did point to was that some sort of visualization is needed for these types of tasks. The two techniques that gave better results were the object coloring and the halo methods. Object coloring gives better performance, but the object halo gives higher user satisfaction. [11]

    2.3 Usability

    Alonso-Rios et al. attempt to define usability in the article "Usability: A Critical Analysis and a Taxonomy". The goal of the research described in the text was to give a more precise and thorough definition of usability than other definitions. The suggested definition stems from other definitions but tries to distill several of them into six main attributes. These attributes are knowability, operability, efficiency, robustness, safety, and subjective satisfaction. Further, these attributes are divided into subattributes. The goal of the division was to avoid overlap between attributes and to avoid ambiguity. The authors state in the article that, even though they have provided a comprehensive set of attributes, not all of them need to be used. Depending on the system, some attributes might not be relevant. Finally, the authors emphasize the importance of defining the attributes in the context in which they are used. [5]

    2.3.1 Efficiency

    Efficiency is defined by Alonso-Rios et al. as the system’s ability to give the expected result in response to the resources used. Efficiency is divided into four subattributes: efficiency in human economic cost, tied-up resources, task execution time, and human effort. Efficiency in human effort is the system’s capacity to give the desired result in response to the human physical and/or mental effort expended. The subattribute efficiency in task execution time is rather self-explanatory. The time measured is both the time for a user to perform a task and for the system to respond. [5]

    2.3.2 Operability

    The attribute operability is explained by Alonso-Rios et al. as the system’s ability to give the user the right, wanted functionality. This means that the system should operate as the user expects. This attribute is also divided into subattributes; the two that are focused on in this study are completeness and precision. Completeness is a measurement of the users’ ability to perform desired tasks with the system. Precision is a measurement of the correctness of the tasks performed, that is, how precisely a task can be executed. [5]


    2.4 Usability Testing and Evaluation

    To evaluate usability, there is a need for usability testing methods. Ghasemifard et al. describe usability testing as "a process of through systematically collecting the usability data of interface and assessing and improving the data." In the paper "A New View at Usability Test Methods of Interfaces for Human Computer Interaction", they suggest and discuss several different methods for usability testing. The goal of usability testing is to help developers improve a system and make it more usable. Ghasemifard et al. use many of the same sources for usability definitions as Alonso-Rios et al. [5] based their taxonomy of usability on. [12]

    In the paper, Ghasemifard et al. state that usability testing is a way of finding problem areas and possible guidelines for how to correct them. This type of testing is a way to see how the intended users interact with the system. Three main activities in the evaluation are described: capture, analysis, and critique, meaning gather data, analyze the data, and find a solution to the problem. [12]

    As mentioned previously, several methods for usability testing are described. Among these are heuristic evaluation, cognitive walkthrough, remote testing, user-based testing, and the focus group method. All of the different methods are evaluated on multiple criteria. Some of these criteria are the cost of testing, the requirements for carrying out a test, flexibility, and the purpose of the method. User-based testing is determined to be a costly but also flexible method. The method is deemed to have high resource requirements, but its purpose seems to be one of the more precise ones: "Measuring usability and interaction problems." User-based testing could be performed by having the user solve a task. The test administrator then records the completion time, whether the task was completed, and potential errors made by the user. A minimum of 8 testers is recommended to get proper results from user-based testing. Worth mentioning is that the minimum number of testers is an estimate and has not been wholly examined or determined. [12]

    2.4.1 Evaluating the Leap Motion Controller with Virtual Reality

    Previously, the LMC has been evaluated with regard to usability in many similar ways. One study [13] had users perform tasks and answer a post-test questionnaire. In other studies, the test administrators observed the testers while they performed their tasks and then conducted post-test interviews [10].

    In a study by Bachmann et al. [7], different ways of gathering data and evaluating interactions with the LMC are suggested. For determining usability, observations, interviews, and questionnaires are recommended. The authors also state that the System Usability Scale (SUS) [14] is a good way to measure usability. To get a general estimate of usability, the User Experience Questionnaire (UEQ) [15] can be used. [7]

    Another, earlier study by Bachmann et al. [13] evaluated the LMC compared to a mouse in terms of efficiency. The users performed clicking tasks, similar to typical mouse point-and-click actions. The authors use Fitts’ law to determine the efficiency; it can be used to calculate the expected time for a movement in, for example, a point-and-click task.

    A problem with using the LMC together with virtual reality headsets is that it can induce dizziness and discomfort in inexperienced users [10]. It is common to try to determine mental strain during usability studies [7, 13]. This is particularly important when working with VR, since so-called simulator sickness (Section 2.4.6) is a very common occurrence [16, 17].


    2.4.2 Post-test Questionnaire

    In the book "Handbook of usability testing", it is stated that the goal with a post-test question-naire is to get information from the testers that cannot be perceived otherwise. The questionsare the same for all of the testers, and therefore, the answers can be compared to each other.The questions should be formulated so that the answers are subjective and non-observable bythe administrator. Also, they should stick to the subject, not diverge, and give as little roomfor interpretations as possible. The questionnaire should only include things that are neces-sary for the testers to answer and that relate to the research question. According to Chisnelland Rubin: "A good test of the relevance of a question is to ask yourself how the answer willmove you closer to a design decision." In the case of this thesis, each answer should bring uscloser to determining the usability of the prototype. In the book, the authors also state that itcould be advantageous to keep the number of questions low, or if the session is long, split itup into sections. If the testers are very tired from doing the tasks, they might not give reliableanswers. [18]

    2.4.3 System Usability Scale

    The System Usability Scale (SUS) is a standardized questionnaire consisting of 10 questions. The intended goal of the SUS was to create a simple, quick, and resource-efficient way of measuring overall usability. The users rate their agreement with the statements on a five-point scale, 1-5. The 10 questions or statements are arranged so that odd-numbered questions are positive (1, 3, 5, 7, 9) and even-numbered questions are negative (2, 4, 6, 8, 10). This creates a balance in the answers. If, for example, a user answers only 5 or only 1, the outcome will not be entirely positive or negative. [14]

    The SUS score is calculated by taking the score of each positive question and subtracting 1. For the negative questions, the score is subtracted from 5. The sum of the resulting scores is then multiplied by 2.5 to create a SUS score scale from 0 to 100. [14]
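    As an illustration of the scoring rule above, the following sketch computes a standard SUS score from ten answers on the original 1-5 scale; the class and method names are chosen for this example and are not part of any SUS tooling.

```csharp
// Sketch of standard SUS scoring: 10 answers on a 1-5 scale,
// odd-numbered items are positive, even-numbered items are negative.
public static class SusScoring
{
    public static double Score(int[] answers)
    {
        if (answers == null || answers.Length != 10)
            throw new System.ArgumentException("SUS expects exactly 10 answers.");

        int sum = 0;
        for (int i = 0; i < answers.Length; i++)
        {
            bool positive = (i + 1) % 2 == 1;      // items 1, 3, 5, 7, 9
            sum += positive ? answers[i] - 1       // positive item: score - 1
                            : 5 - answers[i];      // negative item: 5 - score
        }
        return sum * 2.5;                          // scale the total to 0-100
    }
}
```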

    SUS is a good questionnaire to use for general usability measurements [19]. It has been proven to be a strong measurement of usability and remains reliable when changes are made to it. The SUS still works well if a question is removed completely or if the statements are changed to a positive extreme. If changes are made, it is important to consider these in subsequent analysis. For example, if a question is removed, the score multiplier needs to be changed to reflect this. [20]

    The SUS score is a usability measurement that is easy to grasp. There are still issues with determining what an acceptable score is. According to Bangor et al. [19], a score of 50 or above is OK, and a score of 70 or above is acceptable.

    2.4.4 Post-study System Usability Questionnaire

    The Post-Study System Usability Questionnaire (PSSUQ) is considered to be a good alternative to the SUS, with high reliability [19, 21]. The PSSUQ is also a good measurement of general usability [21]; more specific aspects of usability can be measured by analyzing a single item in the questionnaire. The reason a single item can be analyzed is, among other things, that the questionnaire has a grading scale from 1 to 7. More steps make it easier to analyze the answers to a single question/statement. Another way the PSSUQ differs from the SUS is the number of questions: the PSSUQ has 19 items in the main version and 16 items in a shorter variant. [22]


    The PSSUQ is very useful when used continuously throughout a design process to determine usability between iterations. It can also be helpful when comparing different solutions to each other. [22]

    2.4.5 User Experience Questionnaire

    The User Experience Questionnaire (UEQ) is a questionnaire that measures user experience and usability [15]. The UEQ can help determine efficiency, but also the perceived attractiveness of an interface [23, 15]. In the original questionnaire, the users are asked to rate 26 items on a 7-step scale [15]. A shorter version exists, with only 8 items, but it does not measure with as high precision as the original [23]. The original UEQ has been proven to be quick to fill out even though it has 26 items [23]. The questionnaire is designed to measure the overall experience of an interface together with usability; it is not mainly focused on either [15].

    2.4.6 Simulator Sickness Questionnaire

    Simulator sickness is a common occurrence in simulators and VR. The sickness can present itself as nausea, dizziness, eyestrain, etc. The cause could be a conflict between the visual and vestibular (balance) systems: for example, what is presented in the VR headset does not correspond to the user’s actual movement. [16, 17]

    A way to measure the amount of discomfort, or the severity of the symptoms, is the Simulator Sickness Questionnaire (SSQ). The questionnaire was presented in a paper by Kennedy et al. called "Simulator Sickness Questionnaire: An Enhanced Method for Quantifying Simulator Sickness." In the paper, the authors categorize the different symptoms into 3 main groups: nausea, oculomotor, and disorientation. They then perform a factor analysis to determine a way of scoring the different symptoms in the 3 groups. The testers that fill out the questionnaire grade the severity of their symptoms from 0 to 3. The results from the factor analysis are used to exclude symptoms from some of the categories. The users’ grading is then used in a formula to calculate the total severity of the symptoms. [16]

    The SSQ has been used in other studies as a way of determining the level of simulator sickness in VR. Oskarsson and Nählinder reworked the SSQ into a questionnaire that is more compact and quicker to answer, which they called the VR sickness symptoms questionnaire (VRSQ). They use the two symptoms with the highest loading from the factor analysis of the SSQ: in the nausea category, nausea and stomach awareness; in the oculomotor category, eyestrain and difficulty focusing; and finally, in the disorientation category, dizziness with eyes open and closed. The severity of the symptoms is rated on a scale from 1 to 7, 1 being the lowest. Oskarsson and Nählinder also added a question regarding the discomfort of wearing the VR headset to their questionnaire. [17]

    The VRSQ will, for the purposes of this thesis, serve as a measurement of the users’ human effort (see 2.3.1). Since simulator sickness is not all that is considered in this study, the shorter form of the VRSQ, compared to the SSQ, fits the study well.


  • 3 Method

    This chapter is divided into two larger sections: implementation and evaluation. In the implementation section, the development of the system and how the technologies are used are detailed. In the evaluation section, the test session and how data was gathered for the actual evaluation are described.

    3.1 Implementation

    The implementation consists of several main parts: the Leap Motion part, the flight part, and the virtual reality part. For all parts of the development, the game engine Unity1 was used as the development environment. The design requirements were that the user should be stationary in a virtual cockpit and should be able to fly using HOTAS and interact with objects in the cockpit using the LMC. The basic Leap Motion parts were implemented first. After the LMC interaction components were in place, the flight simulation and control of the fighter jet were developed. Finally, these parts were combined, and VR support was added to the project. With all these parts in place, the specifics for the user test tasks were implemented.

    3.1.1 The Unity Game Engine

    Unity was used since it is a popular and robust engine that supports the technologies used in the project. The version of Unity used was 2018.3.5. Leap Motion has several examples and pre-made assets for Unity2 (see Section 3.1.2). Unity also has built-in VR support and free assets for VR.

    For the actual simulation of flight, another Unity asset was used. The Unity Standard Assets pack3 is an asset pack made by Unity Technologies. The included Aircraft Jet asset was adjusted to be used with a custom 3D model and to be controlled via a HOTAS setup. This airplane controller asset is not a truly realistic flight model, but it is sufficient for this project, which evaluates the combination of VR, HOTAS, and the LMC.

    1. https://unity3d.com/
    2. https://developer.leapmotion.com/unity/#5436356
    3. https://assetstore.unity.com/packages/essentials/asset-packs/standard-assets-32351


    To create an interesting, immersive environment to fly around in, the built-in Unity terrain engine was used. With this tool, a terrain can easily be made by supplying a height map and adjusting some parameters, such as size and resolution. Modifications such as applying textures to the terrain were made. The textures used were the terrain asset textures from the Standard Assets pack.

    3.1.2 Interaction and the Leap Motion Controller

    The main interaction components were constructed for interaction with the hands using the LMC. The LMC can be placed on a table, facing up, or mounted to the front of an HMD. For this study, the head-mounted configuration was used. The components used in the system were based on the basic UI components from the Leap Motion Interaction Engine for Unity4. The main components of the cockpit are buttons, dials, and flip switches. Basic buttons come with the Interaction Engine examples, but the other components were made specifically for this thesis. The custom components were based on, and made in a similar style to, the ones supplied by the Interaction Engine. The goal was that the components should, as much as possible, use physics and not simply animate into place, with the intent to compensate for the lack of haptic feedback.

    3.1.3 Implementing Virtual Reality Support

    During development, two different VR HMDs were used. At first, the basic VIVE5 headset was used. During the later parts of development, a prototype version of Varjo’s VR-16 headset was used instead.

    To implement VR in Unity, some additional assets and programs were needed. The SteamVR plugin asset7 was used to provide basic support for VR in Unity. The asset also comes with some examples in Unity. In addition to this plugin, the programs Steam and SteamVR8 needed to be installed. The plugin asset provides a basic VR camera rig and ways of handling input from VR controllers if needed.

    To implement support for Varjo’s prototype HMD, their plugin9 for Unity was used instead of the SteamVR plugin. It is built upon SteamVR and works in similar ways. A special consideration when working with the prototype HMD is that it has two types of screens: one for the focus area, or center of view, and one for the peripheral parts of the view. The focus area screen has a higher resolution than the peripheral one.

    3.2 Evaluation

    In this section, the method for the evaluation of the implemented system is described. Also, the tasks the users were asked to perform with the system and the structure of the test session are explained.

    3.2.1 Tasks and User Test Session

    The users were asked to perform a set of tasks so that the usability of the system could be measured. The tasks mainly consisted of regular tasks similar or related to fighter jet simulation and training (see the User Tasks section below). The details and steps in the tasks were constructed for this prototype and study.

    4. https://leapmotion.github.io/UnityModules/interaction-engine.html
    5. https://www.vive.com/us/product/vive-virtual-reality-system/
    6. https://varjo.com/vr-1/
    7. https://assetstore.unity.com/packages/tools/integration/steamvr-plugin-32647
    8. https://store.steampowered.com/steamvr
    9. https://varjo.com/use-center/developers/unity-page/


    An example of such a task is a starting sequence, which, in this study, has the user flip some switches in order. This task fits this evaluation well since it encompasses most of the main ways of interacting with the system.

    User-based testing, as suggested by Ghasemifard et al. [12], was used in this study (capture, analysis, and critique). They estimated that a minimum of 8 users is needed to get enough data. The data recorded was completion time and errors made during the tasks. In addition, the users were encouraged to think aloud, according to the think-aloud method [24]. They were asked to remark on what they were thinking in relation to the system and whether anything was confusing. These comments were recorded by the test administrator. This was done to get comments and data from the users during the test session, and not only in the post-test questionnaire. In addition to what the users said aloud, other relevant observations made during the test sessions were noted by the administrator.

    At the start of the test session, the users were asked to fill out the first part of the questionnaire. This part concerns previous experience with VR, and the VRSQ, to get a sense of the users’ normal level of the symptoms. For the initial VRSQ, the fifth item, regarding discomfort of wearing the headset, was removed since it does not apply.

    Before any of the actual tasks were performed, the users were asked to interact with a tutorial environment. This environment consisted of a panel with buttons, knobs, and switches. The purpose of this interaction was to familiarize the user with the different interaction components. There was no goal or stated task in the tutorial environment. When the users felt that they had a basic grasp of how to interact with the different components, and what type of interaction was expected of them, they were moved to the virtual cockpit environment for the actual tasks.

    User Tasks

    The first step for the users was to perform a reading test. This was done by reading text presented on a screen in front of the user. This part was meant to test whether text displayed in the virtual world could be read by the user. The users were presented with text in 5 different heights, each on its own line. The text was placed in the world using TextMesh Pro10 in Unity. The text was organized like a Snellen chart11 used in sight examinations, with the largest text at the top. The goal of the task was for the user to try to read the text in front of them and tell the test administrator whether text of a specific size could be read or not. The five different sizes were about 12, 9, 6, 3, and 2 millimeters in the virtual world. The millimeter sizes were estimated by using standard 3D cubes in Unity as a reference. The text was placed on a flat surface (screen) in the virtual world about 50 to 65 centimeters in front of the camera (user).

    The second step was to execute the starting sequence. The users were instructed to flip three flip switches in a set order to complete the sequence. The test administrator measured the time from the go signal until all the switches had been flipped into position. After the starting sequence, the users could throttle up and take off. This test mainly evaluates the Leap Motion and how well interactions can be performed with it in VR.

    When airborne, the users were asked to locate a target with the help of an overhead map view in front of them. The goal for this section of the test was to line up next to the target, meaning match its altitude and speed. The user’s altitude, as well as the target’s altitude, was displayed on the screen in the cockpit. The speed was not displayed. This part evaluated spatial awareness in the virtual world and the HOTAS/VR combination.

    10. https://docs.unity3d.com/Packages/[email protected]/manual/index.html
    11. https://en.wikipedia.org/wiki/Snellen_chart


    When the users had aligned with the target, the goal was to perform air refueling. The steps included in the refueling sequence in the prototype are:

    • Make sure your aircraft is aligned with the target

    • Extend the boom

    • Attach the boom to the hose by flying close to it

    • Keep it attached (don’t fly too far away) for 10 seconds while refueling

    Whether, and how many times, the user detached from the hose prematurely by moving too far away was noted. This final part of the test was meant to test the combination of all the different technologies (LMC, VR, and HOTAS) and how well a user can perform a task using them.

    3.2.2 Measuring Execution Time

    To measure execution time, and to get a sense of whether the users’ times recorded during the tasks (see 3.2.1) were good or bad, a baseline was needed. To get an expected time for a part of a task, Fitts’ law (Eq. 3.1) was used (see Section 2.4.1), where MT is the expected time for a movement. The variable D is the movement distance from the start to the target, and W is the width of the target. Distance in the virtual world is represented in units, where 1 unit represents 1 meter; that is, the standard cube 3D object in Unity is 1 unit wide, which corresponds to 1 meter.

    MT = a + b · log2(2D / W) [ms]    (3.1)

    Bachmann et al. [13] used Fitts’ law in their study related to the LMC. Since this thesis only uses Fitts’ law as a baseline, and mostly the same hardware as Bachmann et al. was used, their constants for a and b in Equation 3.1 were used: a = 369.12 and b = 193.28. Equation 3.1 was used to calculate an estimate of the execution time for applicable tasks.

    3.2.3 Questionnaire

    After the users had performed the tasks, they were asked to answer a questionnaire. The grading scale for the two questions regarding previous experience (with VR and with flight simulation) was 1 (no experience) to 7 (very frequent use). The final version of the questionnaire can be found in Appendix A. The original form of the SUS [14], with 10 items, had the following statements:

    1. I think that I would like to use this system frequently

    2. I found the system unnecessarily complex

    3. I thought the system was easy to use

    4. I think that I would need the support of a technical person to be able to use this system

    5. I found the various functions in this system were well integrated

    6. I thought there was too much inconsistency in this system

    7. I would imagine that most people would learn to use this system very quickly

    8. I found the system very cumbersome to use

    9. I felt very confident using the system

    10. I needed to learn a lot of things before I could get going with this system


    The version used in this study contained 9 of the original 10 items. Statement 4 was removed since it was not relevant with regard to the intended users of the system. Another change from the original was the number of steps on the grading scale: instead of the original 5-step scale, a 7-step scale was used. The motivation behind this was that a greater number of steps makes it easier to examine the answers to a single item in the questionnaire [22].

    The reason the SUS was chosen over other questionnaires, such as the UEQ or the PSSUQ, was its short form and its ability to quickly measure general usability. Its resistance to smaller changes, such as removing a question, was also considered. Many of the statements in the SUS can easily be related to specific measurements of usability such as completeness, precision, and human effort.

    Another part of the final questionnaire was the evaluation of simulator sickness. The shorter version, the VRSQ [17], was used to evaluate simulator sickness in virtual reality. For this part, the users were asked to rate the experienced severity of 4 symptoms and the general discomfort of wearing the HMD on a scale from 1 to 7. This scale is the one used by Oskarsson and Nählinder [17] and fits well with the grading scale used for the modified version of the SUS (1-7). The items in the final questionnaire from the VRSQ were:

    1. Nausea

    2. Stomach awareness

    3. Eyestrain

    4. Difficulty focusing

    5. Discomfort of wearing the headset

    The severity of all of the items above was rated on a scale from 1 (none) to 7 (very severe).


  • 4 Results

    This chapter presents the results from the process of implementing and designing the prototype, as well as the results obtained from the user test session. In the implementation section, the results of the implementation process, as well as decisions made during development, are detailed. In the evaluation section, the results from the user tests and questionnaires are presented.

    4.1 Implementation

    In this section, the results of the development process are presented. Specific details regarding some interaction components and information regarding what functionality is included in the prototype are also explained.

    4.1.1 Interaction Components and Leap Motion

    As mentioned in Chapter 3, the interaction components made for use with the LMC were implemented first. The Leap Motion Interaction Engine for Unity served as a basis for these components. The basic button components came with the engine, but the others were custom-made for the prototype. The tutorial environment used in the test session integrates all of the interaction components available in the prototype. A screenshot of this environment can be seen in Figure 4.1. On the right side are the buttons from the Interaction Engine examples. On the left side, the flip switch and knob can be seen.

    All of the components change color (see Section 2.2, regarding object coloring) when a hand is close by. This type of interaction is called hover and comes included with the standard hand assets. The goal of this is to indicate that the hands are close to an object and to give better spatial perception when interacting with the LMC.


    Figure 4.1: The interaction components in the tutorial environment

    The Knob Component

    The first custom component that was made is called the knob. This component is a turning knob that snaps into steps, similar to a clock’s hour hand. The component was made with a flattened cylinder base and a rectangle on top; both are primitives included in Unity. The component can be seen in the bottom left of Figure 4.1. The dark rectangle is the interaction part that the user is supposed to grasp. Grasping is a standard interaction in the Leap Motion Interaction Engine. Grasping of an interaction object is automatically detected if the standard hand models and accompanying objects from the Interaction Engine are added in Unity. In the first version of the knob, the user was meant to turn the rectangle part of the knob directly. This, however, did not give a good enough interaction, since the LMC lost track of the hand when it was twisted in certain ways. Also, the user had to make rather big motions to rotate the rectangle. Instead, an alternative way of interacting with this component was designed.

    It was determined during development that maintaining a grasp for the duration of the interaction, while not actually grasping a physical object, did not feel truly natural, and after a while it became uncomfortable to maintain the grasp. To solve the problems with the first version of the knob, a larger version of the knob was displayed near the original knob and remained there until an interaction was made with it. This larger version, in its final state, can be seen in Figure 4.2.

    Since the knob was designed to snap into a given set of steps (a step is 360 degrees divided by the number of steps), the buttons seen in Figure 4.2 were added on the larger cylinder as an easier way of interacting with the component. The buttons correspond to rotations of the rectangle on the original component. When a button is pressed, the original component’s rectangle is rotated into the corresponding position, and the larger knob, with its buttons, disappears.

    The second iteration of the knob gave an appropriate response, meaning that the knob turned to the desired location. But it was a bit cumbersome to interact with in the tight space of a cockpit. Therefore, a third, simpler version of the knob was implemented. This third version (the same tutorial model as in Figure 4.1) also rotated in set steps, but the interaction was different.


    Figure 4.2: The second version of the knob component

    Every time a part of the hand collided with this version of the knob, it rotated one step. This version rotated a step when the user entered the knob’s collision volume with a part of the hand or a finger. For the user to rotate many steps, they had to enter, exit, and re-enter the knob’s collision volume repeatedly. This interaction can be performed with small movements, such as tapping with a finger.
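    A minimal Unity sketch of this step-per-entry behaviour is given below. It assumes the knob has a trigger collider, that a Rigidbody is present on one of the colliding objects so trigger events fire, and that the hand colliders carry a tag such as "PlayerHand"; the class name, tag, and step count are illustrative and not taken from the thesis code.

```csharp
using UnityEngine;

// Sketch: rotate the knob one step each time a hand part enters its trigger volume.
// The knob GameObject needs a Collider with "Is Trigger" enabled.
public class StepKnob : MonoBehaviour
{
    public int steps = 8;                       // number of snap positions per full turn

    private void OnTriggerEnter(Collider other)
    {
        if (!other.CompareTag("PlayerHand"))    // illustrative tag for the hand colliders
            return;

        float stepAngle = 360f / steps;         // a step is 360 degrees divided by the step count
        transform.Rotate(0f, stepAngle, 0f);    // snap one step around the knob's local up axis
    }
}
```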

    The Flip Switch

    The flip switch component can be seen in the top left of Figures 4.1 and 4.2. The switch flips into one of two positions. In Figure 4.1, the initial position is shown; this is not one of the two positions. The two modes the flip switch can be put into are ±45 degrees from the initial position, as seen in Figure 4.2. The user flips the switch by moving it with a finger. When the switch has been moved about 10 degrees from one of the two end positions towards the other, the flip switch snaps into the other one.

    The switch is implemented using the built-in Hinge Joint1 in Unity. The joint has a spring attribute, in which a parameter for the desired angle can be set. The joint will then apply a given amount of force to reach this desired angle. In the flip switch implementation, the snapping into a different end mode is done by changing the desired angle of the joint’s spring to ±45 degrees.
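    The snapping behaviour described above could be sketched in Unity roughly as follows. The ±45 degree end positions and the 10-degree flip threshold follow the text, while the settling threshold, field names, and class name are illustrative assumptions rather than the thesis implementation.

```csharp
using UnityEngine;

// Sketch: a two-position flip switch driven by a HingeJoint spring.
// The spring pulls the switch towards +45 or -45 degrees; once the user has
// pushed it about 10 degrees away from the end it is resting at, the target flips.
[RequireComponent(typeof(HingeJoint))]
public class FlipSwitch : MonoBehaviour
{
    private HingeJoint hinge;
    private float targetAngle = 45f;   // end position the spring currently pulls towards
    private bool settled;              // true once the switch has reached that end position

    private void Start()
    {
        hinge = GetComponent<HingeJoint>();
        hinge.useSpring = true;
        SetTarget(targetAngle);
    }

    private void Update()
    {
        float offset = Mathf.Abs(hinge.angle - targetAngle);

        if (offset < 2f)
            settled = true;            // the switch is resting at its end position

        // Flip only after the switch has settled and is then pushed ~10 degrees away,
        // so the spring-driven travel to the new end does not immediately flip it back.
        if (settled && offset > 10f)
        {
            settled = false;
            SetTarget(-targetAngle);
        }
    }

    private void SetTarget(float angle)
    {
        targetAngle = angle;
        JointSpring spring = hinge.spring;   // JointSpring is a struct: copy, modify, reassign
        spring.targetPosition = angle;       // the desired angle the spring drives towards
        hinge.spring = spring;
    }
}
```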

    A specific detail for the flip switch concerned having many flip switches positioned close to each other. When positioned in this way, accidental interaction with another switch can occur. This was solved by only allowing the user to interact with one switch at a time. To make sure that only one switch at a time was interacted with, the hovering interaction method supplied by the Leap Motion Interaction Engine was used. The hover interaction has an additional attribute called primary hover, which was used to make sure that the user only interacted with one switch. Primary hover is an attribute that can only be applied to one interaction object at a time.

    1. https://docs.unity3d.com/Manual/class-HingeJoint.html


    4.1.2 Implementation of Flight Controls

    The flight (aerodynamic) model used is the one from the Unity Standard Assets pack mentioned in Chapter 3. These flight controls enabled keyboard and mouse control directly but had to be modified to work with the HOTAS setup. Unity’s built-in Input Manager was used to get input from the throttle and stick. The input sensitivity and some parameters of the controller script were modified to give a better feel when controlling the plane. Some of these parameters are the dead zone and sensitivity of the input, and also parameters on the control script regarding the input’s effect on pitch, roll, and yaw.
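    Reading the stick and throttle through the Input Manager could look roughly like the sketch below. The axis names ("Pitch", "Roll", "Yaw", "Throttle") and the dead-zone value are placeholders that would have to be mapped to the joystick and throttle in the project's Input Manager settings; they are not taken from the thesis.

```csharp
using UnityEngine;

// Sketch: feeding HOTAS axes from Unity's Input Manager into an aircraft controller.
public class HotasInput : MonoBehaviour
{
    public float deadZone = 0.05f;              // ignore tiny stick movements (illustrative value)

    private void FixedUpdate()
    {
        float pitch = ApplyDeadZone(Input.GetAxis("Pitch"));
        float roll = ApplyDeadZone(Input.GetAxis("Roll"));
        float yaw = ApplyDeadZone(Input.GetAxis("Yaw"));
        float throttle = Mathf.Clamp01(Input.GetAxis("Throttle"));

        // The values would then be passed on to the airplane controller script
        // from the Standard Assets pack, e.g. something like
        // aircraft.Move(roll, pitch, yaw, throttle, false);
    }

    private float ApplyDeadZone(float value)
    {
        return Mathf.Abs(value) < deadZone ? 0f : value;
    }
}
```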

    4.1.3 Virtual Reality Implementation

    To implement support for VR, the assets and programs mentioned in Chapter 3 needed to be installed. A component from the SteamVR plugin was added to the project and scene so that the view could be rendered to the headset. When VR is enabled, the main camera is controlled by the movement and rotation of the VR headset. Some problems occurred because movements of the main camera were also made from scripts, mainly to follow the rotation and translation of the plane. The problems were solved by creating a parent object for the camera and instead moving the parent object from the scripts. With this setup, the VIVE HMD could be used with the prototype.
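    The parent-object workaround can be sketched as follows: the headset drives the camera's local pose, while a script moves only the parent rig so that it follows the aircraft. The seat-anchor transform and the class name are illustrative assumptions, not the thesis code.

```csharp
using UnityEngine;

// Sketch: the VR system writes the child camera's *local* position and rotation,
// so scripted movement (following the plane) is applied to this parent rig instead.
// Setup: the Main Camera is a child of the GameObject carrying this script.
public class CockpitCameraRig : MonoBehaviour
{
    public Transform seatAnchor;   // empty transform at the pilot seat in the cockpit (illustrative)

    private void LateUpdate()
    {
        // Move the rig, not the camera; headset tracking is added on top
        // as the camera's local pose under this parent.
        transform.position = seatAnchor.position;
        transform.rotation = seatAnchor.rotation;
    }
}
```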

    During development, an informal test of the VIVE HMD was made. The test was similar to the one performed by the users during the user test session, regarding whether text was readable. It was determined that the displays in the VIVE headset were not sufficient for reading all of the different sizes of text. Text of 6 millimeters in size could be read with the VIVE, but with difficulty. It was therefore decided that instead of the VIVE HMD, a prototype of the Varjo VR-1 headset was to be used.

    To implement support for the prototype HMD, steps similar to the implementation of the SteamVR plugin were performed. The main difference is that VR support in Unity had to be turned off, and another asset and other scripts were used for handling camera movement, head tracking, and rendering to the HMD.

    When the same informal reading test was made with Varjo’s prototype, all of the different sizes of text could be read. The 2 millimeter text could be read, with minor difficulty, if it was placed in the focus area in the center of the view. Worth noting is that these informal tests were not performed with the same setup used in the more formal user test sessions; the informal test was performed standing, without calibrating the HMD or adjusting the straps.

    4.1.4 The Final Prototype

    In the version of the prototype that the user tests were performed on, all of the features described previously were included. The features are not true to life and were designed for this prototype, with the intent of being similar to what could be present in a real-world cockpit. In the prototype, the user was seated in a virtual fighter jet cockpit, and a prototype version of Varjo’s virtual reality HMD was used as a display. The users could start the plane by performing a set starting sequence (described in 3.2.1), interacting using the LMC. The user could interact with a knob component to extend or retract a boom for aerial refueling purposes. The interaction components included were limited to the ones necessary for the user tasks.

    The jet was controlled via a stick and throttle. Additional information, such as an overhead map view and altitude, was presented on a screen in front of the user.


    The reasoning behind including these was to better facilitate spatial awareness and the user’s perception of proximity to the refueling plane and its fuel hose. How this information was presented in the cockpit was either obtained from publicly available sources on the internet or constructed for this prototype, and is not accurate to how it is presented in a real cockpit. An overview of the cockpit can be seen in Figure 4.3, with the flip switches to the right, marked 1, 2, and 3.

    Figure 4.3: Overview of the cockpit

    4.2 Evaluation

    In this section, the results from the user test sessions are presented. This includes post-test questionnaire answers, measured time for the user tasks, and a compilation of comments and feedback from the think-aloud protocol.

    4.2.1 Questionnaire Answers

    The answers to the different parts of the questionnaire are presented in separate tables and figures. In Table 4.1, the users’ ratings of their previous experience are presented per user. The users are referred to as Ux throughout this thesis, where x is the identifying number of the user.

    User   VR experience   Flight simulation experience
    U1     3               5
    U2     2               2
    U3     5               7
    U4     3               7
    U5     4               4
    U6     2               7
    U7     1               7
    U8     2               7

    Table 4.1: Previous experience per user


    In Table 4.2, the users’ answers to the VRSQ section of the two parts of the questionnaire are shown. The number is the difference between the post- and pre-test answers. A value of 0 means no change in symptom severity. A positive value means an increase, and a negative value a decrease, in severity.

    User      Nausea   Stomach awareness   Eyestrain   Difficulty focusing
    U1        0        0                   0           2
    U2        3        -1                  1           1
    U3        2        1                   1           0
    U4        0        0                   -1          0
    U5        1        0                   0           0
    U6        0        0                   0           0
    U7        2        0                   1           1
    U8        0        0                   0           0
    Average   1        0                   0.25        0.375

Table 4.2: Difference between post- and pre-test VRSQ answers

In Figure 4.4, the average of the users' answers to the post-test part of the questionnaire is presented. The standard deviation for each of the questionnaire statements is also shown in the graph. Statement 1 (S1) to statement 5 (S5) is the VRSQ part of the questionnaire; from S6 onward are the SUS statements.

Figure 4.4: Average of post-test questionnaire answers (bar chart; x-axis: statements S1–S14, y-axis: score, 0–8)

The SUS score for the averages is calculated from the formula mentioned in Section 2.4.6. Since a statement was removed from the questionnaire, the score multiplier needs to be adapted. With the 7-step scale, a maximum score of 54 can be obtained without the multiplier. To get a final SUS score between 0 and 100, the score is multiplied by the factor 100/54.
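As a small worked example of this rescaling (the raw sum below is back-calculated from the reported score, for illustration only):

maximum raw sum = 9 SUS statements × 6 points = 54
SUS score = raw sum × (100 / 54)
e.g. a raw sum of 40.5 points gives 40.5 × 100/54 = 75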

    The prototype got a SUS score of 75 from the post-test questionnaire answers.


    4.2.2 User tasks and execution time

The results from the user tasks are shown in Table 4.3. Worth noting about the reading test is that none of the users could read level 5, the smallest size of text, without placing it within the HMD's focus area. This means that they had to move their head, and not just their eyes, to read the smallest text. The execution time for the starting sequence is measured from the test administrator's go signal until the third, and last, flip switch was flipped.

User     Reading test level  Starting sequence time [s]  Refuel detachments
U1       4                   5.15                        0
U2       5                   13.9                        3
U3       5                   6.45                        3
U4       5                   7.6                         2
U5       4                   3.75                        2
U6       5                   5.5                         1
U7       4                   9.3                         0
U8       5                   3.8                         0

Average  4.625               6.931                       1.375

    Table 4.3: User task results

The expected time for the starting sequence is calculated using Fitts' law, as detailed in Section 3.2.2. The width of the flip switches was about 0.006 meter. Three expected times were calculated and summed, since the movement could be separated into three sections. First, the user had to move their hand from a neutral position to the first switch, then between the first and second switch, and finally between the second and third switch. These distances were 0.41, 0.035, and 0.024 meter. The distances were estimated with Unity standard cubes as a reference, as mentioned in Section 3.2.2.

Using these variables in the formula described in Section 3.2.2, an estimated time for the hand movements of about 3.743 seconds is obtained. This does not include any other movements, such as head or eye movement.
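For reference, the index-of-difficulty terms behind this estimate can be reconstructed roughly as follows, assuming the Shannon formulation of Fitts' law (the coefficients a and b are those given in Section 3.2.2 and are not restated here):

ID_1 = log2(0.41 / 0.006 + 1) ≈ 6.12 bits
ID_2 = log2(0.035 / 0.006 + 1) ≈ 2.77 bits
ID_3 = log2(0.024 / 0.006 + 1) ≈ 2.32 bits
Estimated total time = Σ (a + b · ID_i) ≈ 3.743 s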

    4.2.3 Observations and think-aloud feedback

During the test sessions, feedback from the users and observations were noted. The users' remarks concerned all parts of the prototype. Much of the feedback was regarding the flight controls or the flight model. This feedback will not be presented here since it is not relevant to the study.

When the users were placed in the cockpit, many of them tried to interact with non-interactive components. One user (U1) remarked that if you see something that looks like you can interact with it, you want to try to interact. Even for the components that were clearly interactive, some users misunderstood the possible interactions. User U2 wanted to grab the switch with a pinch gesture. Both U4 and U6 felt that interacting with their index finger was unnatural. They both wanted to flip the switches with their thumbs, which caused some tracking issues.

When asked, most of the users were positive toward the hand representation and the tracking of hands via the LMC. Some users remarked on the precision of the tracking, and that the movements of the fingers felt very natural. A lot of feedback was obtained regarding improvements to the interactions. Many users felt that additional feedback, other than object coloring, was needed. Some users thought that either audio feedback or a physical mockup, providing physical contact as feedback, could improve the interactions.


Another problem regarding visual feedback was that the object coloring did not properly assist the users with depth perception. The test administrator observed that many users had problems determining the distance to interaction components in the tutorial. Most of the users seemed to be better at determining the distance between their plane and the refueling plane than between their hand and a flip switch.

No feedback was given regarding problems with the VR part of the prototype. The only feedback received was that most users felt that being able to look all around them was positive and provided good spatial perception. Some users remarked that the sense of speed was lacking in the prototype, and that it was therefore hard to control the speed of the plane.


5 Discussion

In this chapter, the results obtained and the method used will be discussed. In the first part, the results presented in Chapter 4, and what these results indicate about the usability of the system, will be examined. In the second part, the method used to implement the system, as well as the method used to gather data, will be discussed.

    5.1 Results

In this section, the different parts of the result, and how they relate to the metrics for efficiency and operability, are discussed. Some discussion regarding what could have been done differently is also included.

    5.1.1 Analysis of VRSQ answers

The VRSQ part of the questionnaire showed a minor increase in severity for some of the users. The average increase was very low or almost non-existent. One reason might be that the user is in control of the movement of the plane, and if they have some experience related to flight, no unexpected movement occurs. Another reason could be that the users were not required to perform any complex flight maneuvers. Most of the flight part of the test sessions had the user fly straight behind the refueling plane. There does not seem to be any evidence for a direct link between previous experience and severity of symptoms in the VRSQ. Worth noting is that the only user who rated below 4 in previous flight experience, and also had low VR experience (2), is the one who experienced the greatest increase in symptoms overall. This might indicate that overall experience with a similar system, whether it is VR or flight simulation, might increase tolerance for the symptoms in the questionnaire.

One statement in the VRSQ part of the post-test questionnaire that has a much higher average score than the rest of the symptoms is the discomfort of wearing the headset. This statement (S5) got an average of 3.375. The HMD used is a prototype version that does not have the same comfort as a final product, and it is heavier than, for example, the VIVE headset.


    5.1.2 Post-test Questionnaire

The overall SUS score can be used to determine the general usability of the system. The score for the prototype was 75, which is deemed to be acceptable [19]. This is a good indicator that the technology used in the system is suitable and usable. Individual statements in the questionnaire can be used to give an indication of the prototype's operability, with regards to completeness and precision. If a system is easy to use, the correct and desired tasks should be able to be performed. Therefore, statement S8 can be related to both precision and completeness. Statement S8 in the post-test questionnaire got an average score of 5.75, which is rather high.

If the system is too complex (S7), and the user does not feel confident using it (S13), the precision of the system can be determined to be low. S7 got a low score (1.625), and S13 got a high average score (5.25). This should, together with the ease of use of the system (S8), indicate that the technology combination used in the prototype enables the user to interact with high precision.

The completeness of the system is harder to determine. If the different parts of the system are well integrated (S9), and few inconsistencies exist (S10), it can be determined that the system has high completeness, and as a result, high operability. The score for statement S9 was in the middle of the scale: 4.75, with 4 being the exact middle. It is still positive but could be higher. The average score for statement S10 is quite low, 2.25. These two statements together indicate that the system has high completeness, but is not perfect. This is to be expected for the first version of a prototype.

    5.1.3 User task result

The precision of the system can be related to more than just interacting with components. Being able to correctly read the information on, for example, a screen is important to achieve high precision. Almost all of the users could read the smallest text during the reading test, which indicates that the user can read the information on the screen with very high precision. This also indicates that the problem regarding readability in VR, which was detected in the previous study conducted at Saab (Section 1.1), is solved by using a headset with a display of higher quality.

The execution time for the start sequence is much higher than the estimated time. A slight increase is to be expected because head movement and rotation were not part of the estimation. The users' previous experience with VR should reasonably relate to their ability to sense depth and interact in VR, but this does not seem to be the case. What could be interesting to examine is whether the execution time would decrease if the cockpit were designed to be true to life. A majority of the users have extensive flight simulation experience, and if the cockpit were closer to what they are used to, either in simulation or in a real-world cockpit, their execution time could decrease. To assist with depth perception, additional visualization of proximity to interaction components could be examined. In the prototype, an object changes color when the user is about 20-30 cm away, which is too far away to interact. An object coloring that fades depending on distance is one idea. Shadows for the hand, as suggested by Vosinakis and Koutsabasis [11], and adding an arm or wrist with shadows, could be a good way of visualizing distance. Using the object halo method would probably give a similar result as the object coloring method used in the prototype.
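A minimal sketch of such a distance-dependent coloring is given below. It assumes a Unity setup where the hand position is available from the tracking scripts; the component and field names are illustrative and not taken from the prototype's code.

using UnityEngine;

// Sketch: fade an interactable's highlight color based on how close the
// tracked hand is, instead of a single on/off color threshold.
public class DistanceFadeHighlight : MonoBehaviour
{
    public Transform trackedHand;           // assumed to be assigned from the hand-tracking scripts
    public Color idleColor = Color.gray;
    public Color nearColor = Color.green;
    public float fadeStartDistance = 0.30f; // metres: where the fade begins
    public float interactDistance = 0.02f;  // metres: roughly within reach

    private Renderer objectRenderer;

    void Start()
    {
        objectRenderer = GetComponent<Renderer>();
    }

    void Update()
    {
        float distance = Vector3.Distance(trackedHand.position, transform.position);
        // t is 0 when the hand is far away and 1 when it is close enough to interact.
        float t = Mathf.InverseLerp(fadeStartDistance, interactDistance, distance);
        objectRenderer.material.color = Color.Lerp(idleColor, nearColor, t);
    }
}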

Even though many users remarked that the controls did not feel very good, and that it was hard to determine speed, the average number of detachments during refueling was low. The users had a chance to get a feel for the controls while approaching the refueling plane, and when attached, most of the users could stay close to the plane. That the performance on this part of the user tasks was good can be related to the users' comments on the ease of determining the distance between the two planes in VR.

During the test sessions, many users remarked on the flight controls. This was not the focus of the study and might have altered some of the answers in the questionnaire. In the end, the flight part of the prototype was good for having the user experience the spatial perception in VR. If the users were placed in a stationary cockpit and asked to perform different tasks, more reliable data regarding the usability of the Leap Motion interaction could be gathered. By doing this, a similar context could be used, but more specific data could be obtained. On the other hand, by enabling the user to fly, a more complete and more immersive experience could be created.

    5.2 Method

In this section, the method will be discussed and analyzed. In the implementation section, the design guidelines and the hardware used are discussed. In the user test and evaluation section, the user test sessions and the choice of questionnaire are analyzed.

    5.2.1 Implementation

The implementation consisted of several parts, as previously stated, the main one being the Leap Motion interaction components. The design of the flip switch and knob components was based upon previous work and guidelines. As mentioned in Section 3.1.2, physics in Unity was used to give the user some feedback from their interactions. This is not something that has been studied previously or is supported by scientific sources. The physics-based interaction sprung from Leap Motion's examples, in which physics is used for button interaction. This way of implementing the interaction components was chosen because no detailed guidance on how this type of interaction with the LMC should be designed was available. The guidelines for interaction that were found are the ones mentioned in Section 2.2. In the cockpit prototype, most of these guidelines were followed. The interactions were performed with one hand at a time, and the users were never asked or instructed to perform multi-hand interactions. However, since the components were placed to the sides of the user, some rotation and movement of the head was required, as well as motions other than shoulder motions.
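As an illustration of the general idea only (not the prototype's actual scripts), a physics-based flip switch in Unity can be modelled as a rigidbody lever on a hinge joint that is pushed by the hand colliders, with the on/off state read from the joint angle. The thresholds and names below are assumptions made for this sketch.

using UnityEngine;

// Sketch of a physics-based flip switch: the lever is pushed by physical hand
// colliders and its on/off state is read from the hinge angle.
[RequireComponent(typeof(Rigidbody), typeof(HingeJoint))]
public class PhysicsFlipSwitch : MonoBehaviour
{
    public float onAngleThreshold = 20f; // degrees past which the switch counts as flipped
    public bool IsOn { get; private set; }

    private HingeJoint hinge;

    void Start()
    {
        hinge = GetComponent<HingeJoint>();

        // Limit the lever's travel so it behaves like a two-position switch.
        JointLimits limits = hinge.limits;
        limits.min = -30f;
        limits.max = 30f;
        hinge.limits = limits;
        hinge.useLimits = true;
    }

    void Update()
    {
        // The hand colliders push the lever; its current angle decides the state.
        IsOn = hinge.angle > onAngleThreshold;
    }
}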

Another important guideline concerns how to visualize what can be interacted with. The chosen method was object coloring. This is implemented in the prototype by having an object change color when the user moves a hand close to it. According to Vosinakis and Koutsabasis [11], the object coloring method and the halo method were the two techniques that gave the highest usability. The object coloring method was used because it provides higher performance according to Vosinakis' and Koutsabasis' study [11], and is easier to see on a smaller object, since a bigger part of the object changes color.

The choice of HMD was at first the VIVE headset, since it was the one available at the time. During development, it was determined that the VIVE HMD was not sufficient to solve some of the problems identified in the previous study conducted at Saab (see Section 1.1). A small informal reading test showed that Varjo's prototype headset was better suited for the task at hand. The design method of the prototype did not change between the two HMDs; it was merely an upgrade in the quality of the display.


    5.2.2 User Test and Evaluation

The tasks performed by the user evaluated different parts of the prototype, such as the LMC and the VR HMD. The structure and steps of the tasks were designed for this prototype and were meant to be similar to what can be done in a real cockpit. By doing this, the LMC could be evaluated in the context of flight simulation and a cockpit in VR, without having to adhere to the features of a real cockpit. A downside with this approach is that the steps might feel unnatural or unusual to someone who has experience with cockpit interaction or flight simulation.

Fitts' law (see Section 3.2.2) was used to get a measurement of the expected time for the hand movements of the user in the starting sequence task. If this type of reference is not used, it can be tough to determine whether the performance of the user during a task is good or not. The same goes for the refueling task. The only thing that is measured in the refueling task is the number of times the user detaches, which hopefully relates directly to previous experience with VR or flight simulation, since it is a matter of piloting an aircraft and determining distance in VR.

The objective data collected during the test sessions were complemented by keeping a protocol of feedback and remarks during the sessions. This data can be very valuable if the testers have experience with the type of system they are evaluating, which was the case in most of the test sessions in this study. In addition to what was said, observations made by the test administrator were noted. One of the key ways of evaluating a system and its usability is through observing how users interact with the system [7, 10, 12]. A downside with this way of collecting data is that if the session is not recorded, and a single test administrator is present, information might be missed. Having more administrators, or observers familiar with the prototype, present during the test sessions, or recording the user and the screen, would have been good sources of information. Unfortunately, neither of these was easily obtainable for the test sessions. If resources are limited, many different sources of information regarding the usability of a system can be valuable. If many separate sources of data point to the same thing, a reasonable conclusion can possibly be drawn. By using objective data such as execution time, feedback and observations made during the test sessions, and questionnaires, reliable data regarding the usability of a system can be gathered. A flaw with subjective data obtained from user feedback and from observing users is that the study might be hard to replicate, and therefore parts of the study might not have very high reliability. Even if the same test was performed with the same users, simply replacing the administrator might change some observation data. However, this does not mean that the data gathered is useless. Even if the data is not complete, it can still have value.

    5.2.3 Choice of Questionnaire

Finding a questionnaire that measures usability, is reliable, and measures what is desired is not an easy task. Constructing a custom questionnaire that surely measures what is desired is possible; however, the reliability of an untested questionnaire is low. Pre-made usability questionnaires were chosen to ensure reliability in the study. To measure what was desired, a combination of two questionnaires was used. The shorter version of the tested SSQ, called VRSQ by Oskarsson and Nählinder [17], was well suited to work with the version of SUS used. The VRSQ is used in this study to measure human effort.

It was important not to overload the user with questions and statements in the questionnaires, as stated by Rubin and Chisnell [18]. Therefore, questionnaires with a low number of statements or questions were desired. Bachmann et al. [7] suggested using SUS and UEQ for evaluating interactions with the LMC. Another questionnaire that is a good alternative to SUS is PSSUQ (see Section 2.4.4). SUS was chosen over both of these for several reasons. First of all, SUS is much shorter than both of the other candidates, even in their shorter versions. SUS has also been used for similar studies, which puts it above PSSUQ. A problem with UEQ is that it measures both user experience and usability, and is not entirely devoted to either. SUS, therefore, is a more precise measurement of usability, which was desired for this study. One advantage of PSSUQ is that it uses a seven-step scale, instead of SUS's five-step scale. However, since SUS is resilient to changes [20], the questionnaire could easily be modified to use a seven-step scale (see the final SUS questionnaire with a seven-step scale in Appendix A, page two).

Some of the statements in the SUS questionnaire can also be related directly to some of the measurements of usability, such as completeness and precision (see Section 1.3). For example, statements regarding how well functions are integrated, and inconsistencies in the system, can give an indication of the system's completeness. How easy the system was to use, the complexity of the system, and how confident the user felt can be used to determine the precision of the system and its associated interactions. This examination of individual statements, as well as a general score for overall usability, should at least give an indication of the usability of the system and its components.

The validity regarding general usability is good in the study. This is, however, not as easy to determine when looking at the more specific metrics (completeness, precision, human effort, and execution time). This is why many sources of data were desired for this study. For example, a questionnaire on its own, measuring precision, might not provide high validity to the study. To complement this source of data, the other sources mentioned previously were used. The thought behind this was: if many sources point to the same conclusion, it should be valid.

    5.3 The work in a wider context

It is hard to examine what impact a study conducted in a particular context could have in a wider context. Looking strictly at the technology present, more widespread use of VR and free-hand interaction could better facilitate different types of training, whether flight training or other types, such as medical training. Using VR is an efficient and safe way of performing this type of training.

Another aspect to consider with this technology is health issues. Modern VR technology is not that old, and its long-term effects have not been examined. Some researchers state that it might affect eye growth [25]. This might be one reason why VR companies discourage young people from using VR [26].

Many VR solutions can be cheaper than other alternatives. In Section 1.1, a bigger non-VR flight simulator was mentioned. Many of the consumer-grade VR solutions on the market are much cheaper than this solution. If a cheaper solution can be used, one that handles some of the tasks of the more expensive version more efficiently, it seems like a good option.


6 Conclusion

Based on the SUS score for usability, and data from the user tests, it can be concluded that VR is a great way of displaying the virtual world. Hand interactions add to the user feeling immersed in the virtual cockpit. Even though there are parts lacking in the prototype, the overall interactions and the spatial perception work well.

With regards to the interaction components, it can easily be determined that more feedback is needed. What type is very hard to conclude; audio feedback is one suggestion, and physical button mockups are another, both suggested by users. The hypothesis that using physics for the interaction components would increase usability is not well supported. There might be two reasons for this. One, the components were quite small and placed in such a way that the user could not really see the movement of the components while interacting. Two, the physics of the components is not very refined and could use some iterations to get right. What can be said about the free-hand interaction, though, is that having hand representations in the virtual world is a good addition. Even though the components might not be perfect, and their implementation did not get much feedback from the users, almost all of the users were impressed and very satisfied with the hand tracking and representation. This indicates that the technology is right for this context, even if the implementation is not perfect.

If the results of the VRSQ are directly related to efficiency in human effort, as defined in Section 2.3.1, the results are very positive. For this prototype, the human effort invested is low, and the efficiency in human effort is therefore rather high. As mentioned previously, this might be because the user tasks were not designed to try to cause simulator sickness; that was not the goal of this part of the study. The goal was to determine how the prototype could be implemented to have high efficiency in human effort. Having the user stationary, in full control of the plane's movement, and able to move their head and look all around them, seems to contribute to high efficiency in human effort. Also, some previous experience with flight seems to have a positive impact on human effort.


The execution time was not as expected. The higher time could be avoided by designing a more robust interaction component. If the users could interact with the component in a way that feels easy and natural to them (for example, by using their thumb), the execution time might decrease. The efficiency in execution time for the prototype is deemed to be low.

As discussed in Section 5.1, the questionnaire answers indicate that the system has high precision. The completeness, based on the questionnaire, is also good, but not as high as desired. If the feedback given by the users, and the observations made, are considered, the lower completeness can be related to the incomplete cockpit and interaction components. The users wanted to interact with more parts of the cockpit and seemed to need a better way of visualizing what could be interacted with.

It is hard to give a final answer to the research question: "How can a virtual reality simulation for a plane cockpit be designed so that user interactions have high usability in terms of user efficiency and operability?" The prototype had high general usability, but not with regards to all of the specific metrics. This could be improved by creating a more complete experience with more components. The most important part when working with free-hand interactions in VR is to visualize clearly what can be interacted with, and to give proper feedback when an interaction has been made. Simple object coloring is not sufficient for a system in this context. It is also important to elicit what types of interactions are common, and what feels natural to the user in a given context. This could be done with a prototype, similar to what has been done in this thesis. Natural-feeling interactions, and having the system and its components behave in an expected way, seem to be the best way of designing a cockpit in VR with free-hand interaction that has high usability.

    6.1 Future work

As stated previously, the prototype and the interaction components need more iterations and more testing. Several suggestions for how they could be designed were given by users. The most common ones were audio feedback for interactions and creating some sort of physical mockup. This all indicates that more research on what type of feedback is sufficient needs to be done.

The hypothesis that physics-based interaction would give high usability did not hold. The overall usability was high for the prototype, but this cannot really be traced to this specific part of the design of the interaction components. Further work on how real-world components should be implemented in a virtual world needs to be done. Having some design principles to use as a base would make this type of development for VR with free-hand interaction much easier.

Looking at the specific context of implementing a cockpit in VR, some guidelines were found. Keeping everything as natural to the user as possible seems like a good way of designing this type of interaction. What can be done is to explore further what types of interactions are natural in the context of a cockpit, and which of these are possible with the technology at hand. Using a thumb to flip a switch in the prototype did not work, mostly because of the placement of the Leap Motion sensor. Different tracking sensors and alternative placements of the Leap Motion sensor are other ways of further developing free-hand interaction in VR. Possibly, using multiple sensors around the user, in addition to the one placed in front of the HMD, would enable a wider range of interactions. In the current prototype, the user cannot interact with components that they are not directly looking at, which is not the most natural way of interacting with all of the components.


Finally, it would be interesting to see how other methods of tracking the user's hands compare to the LMC. For example, using VR gloves1 or similar technology in the same context would be interesting.

    1https://manus-vr.com/gloves/


Bibliography

[1] Zaid Selman. The very real growth of Virtual Reality. URL: https://www2.deloitte.com/xe/en/pages/about-deloitte/articles/treading-water/the-very-real-growth-of-virtual-reality.html (visited on 11/19/2018).

[2] Marco Remondino. Virtual reality and immersive simulation technology outside video gaming: Enterprise applications and potential implications. University of Genova, 2017.

[3] Y. Feng, Q. Wu, K. Okamoto, J. Yang, S. Takahashi, Y. Ejima, and J. Wu. “A basic study on regular polygons recognition of central and peripheral vision field for virtual reality”. In: 2017 IEEE International Conference on Mechatronics and Automation (ICMA) (Aug. 2017), pp. 1738–1743. ISSN: 2152-744X.

[4] Grigore C. Burdea. Virtual Reality Technology. 2nd ed. New Jersey: Wiley, 2003. ISBN: 0471360899.

[5] D. Alonso-Ríos, A. Vázquez-García, E. Mosqueira-Rey, and V. Moret-Bonillo. Usability: A Critical Analysis and a Taxonomy. Department of Computer Science, University of A Coruña, Spain, 2010.

[6] Leap Motion Inc. Leap Motion. URL: https://developer.leapmotion.com/documentation/ (visited on 02/06/2019).

[7] Daniel Bachmann, Frank Weichert, and Gerhard Rinkenauer. “Review of Three-Dimensional Human-Computer Interaction with Focus on the Leap Motion Controller”. In: Sensors 18.7 (2018).

[8] Christianne Falcao, Ana Catarina Lemos, and Marcelo Soares. “Evaluation of Natural User Interface: A Usability Study Based on the Leap Motion Device.” In: Procedia Manufacturing 3 (2015): 6th International Conference on Applied Human Factors and Ergonomics (AHFE 2015) and the Affiliated Conferences, pp. 5490–5495. ISSN: 2351-9789.

[9] Vijayakumar Nanjappan, Hai-Ning Liang, Feiyu Lu, Konstantinos Papangelis, Yong Yue, and Ka Lok Man. “User-elicited dual-hand interactions for manipulating 3D objects in virtual reality environments”. In: Human-centric Computing and Information Sciences 8.1 (Oct. 2018). ISSN: 2192-1962.


[10] Waraporn Viyanon and Setta Sasananan. “Usability and performance of the leap motion controller and oculus rift for interior decoration”. In: 2018 International Conference on Information and Computer Technologies (ICICT) (2018), p. 47. ISSN: 978-1-5386-5382-1.

[11] Spyros Vosinakis and Panayiotis Koutsabasis. “Evaluation of visual feedback techniques for virtual grasping with bare hands using Leap Motion and Oculus Rift”. In: Virtual Reality 22.1 (Mar. 2018), pp. 47–62. ISSN: 1434-9957.

[12] Najmeh Ghasemifard, Mahboubeh Shamsi, Abol Reza Rasouli Kenar, and Vahid Ahmadi. “A New View at Usability Test Methods of Interfaces for Human Computer Interaction”. In: Global Journal of Computer Science and Technology 15 (2015).

[13] Daniel Bachmann, Frank Weichert, and Gerhard Rinkenauer. “Evaluation of the Leap Motion Controller as a New Contact-Free Pointing Device.” In: Sensors 15.1 (2014), pp. 214–233. ISSN: 1424-8220.

    [14] John Brooke. “SUS: A Quick and Dirty Usability Scale.” In: (1996). URL: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.675.3169.

[15] Bettina Laugwitz, Theo Held, and Martin Schrepp. “Construction and Evaluation of a User Experience Questionnaire”. In: USAB 2008 5298 (Nov. 2008), pp. 63–76. DOI: 10.1007/978-3-540-89350-9_6.

[16] Robert S. Kennedy, Norman E. Lane, Kevin S. Berbaum, and Michael G. Lilienthal. “Simulator Sickness Questionnaire: An Enhanced Method for Quantifying Simulator Sickness.” In: International Journal of Aviation Psychology 3.3 (1993), p. 203. ISSN: 10508414.

[17] Per-Anders Oskarsson and Staffan Nählinder. “Evaluation of Symptoms and Effects of Virtual Reality Based Flight Simulation and Enhanced Sensitivity of Postural Stability Measures.” In: Proceedings of the Human Factors and Ergonomics Society Annual Meeting 50.26 (2006), p. 2683. ISSN: 10711813.

[18] Jeffrey Rubin and Dana Chisnell. Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests. Indianapolis, Ind.: Wiley, 2008. ISBN: 0470386088.

[19] Aaron Bangor, Philip T. Kortum, and James T. Miller. “An Empirical Evaluation of the System Usability Scale.” In: International Journal of Human-Computer I