MOBILE3DTV

Project No. 216503

Results of the user-centred quality evaluation experiments

Satu Jumisko-Pyykkö, Timo Utriainen

Abstract: This study had two main goals. Firstly, we explored the impact of different evaluation contexts on users' experienced quality. Secondly, we studied the impact of audio and video coding parameters on experienced overall 3D quality. We conducted two extensive quality evaluation experiments in four different contexts of use with several relatively low audio-video coding parameters (bitrate, framerate, presentation mode) and content types. The experiments were conducted on a portable device using parallax barrier display technology and stereo video coding with each view independently encoded. Our results, based on quantitative evaluations, showed some differences in the contextual quality requirements. Our initial analysis indicates that the current mobile television and near-future bitrates (320kbps and 768kbps) provided an equally pleasant viewing experience independently of the framerates used in 3D mode. A significant increase in the bitrate-framerate resources neither improved the perceived quality nor reached the most pleasantly rated level of 2D quality. These results might be strongly display-dependent. Further analysis needs to complement these results with the analysis of content-dependent differences, qualitative quality impressions and gaze patterns to understand the quality and contextual differences.

Keywords: mobile 3DTV, bitrate, framerate, context of use, subjective quality, experienced quality


Executive Summary

Subjective quality evaluation experiments are conducted to optimize critical system components during the process of system development. Conventionally, the experiments take place in controlled viewing conditions even though the target application is meant to be used in heterogeneous mobile settings. The reported study has two goals. The first goal is to explore how quality is experienced in different contexts of use. The second goal is to investigate the impact of audio and video coding parameters on experienced overall 3D quality. We further developed a hybrid User-Centered Quality of Experience (UC-QoE) evaluation method, especially for capturing the dynamic characteristics of the contexts of use while evaluating quality and for drawing a deeper understanding of it. We conducted two extensive subjective quality evaluation experiments by applying this method.

We conducted the first experiment in two different evaluation contexts (laboratory, home-like) and varied video bitrate, framerate and contents. The second experiment took place in three different evaluation contexts (laboratory, bus travel, waiting in a station) with varied audio and video bitrates and contents. The constructed quality levels represented a relatively low quality range for current and future data rates for mobile 3D television. To go beyond the assumption of the superiority of 3D on a mobile device, we also compared this to the actual 2D quality used in mobile television. The experiments were conducted on a portable device using parallax barrier display technology and simulcast stereo video encoding.

The results of these studies are reported in two phases. In the current document, the quantitative experimental results are described. Our results showed contextual differences in the experienced quality. The results of the first experiment showed that the controlled laboratory settings offered more pleasant viewing conditions compared to the home-like context. There were no significant differences between these contexts in terms of the task load index. In the second experiment, the quality was evaluated to be at the same level in the laboratory and station contexts, while the bus context offered slightly lower quality. The bus context also showed a higher level of time pressure on the task load index compared to the other contexts. Further steps of analysis need to complement the current findings with the results of users' attention while viewing, mapping with detailed contextual characteristics, and qualitative impressions and experiences from these contexts.

Our results showed some influences of the compared parameter combinations on quality. In the 3D mode, the current mobile television and near-future bitrates (320kbps and 768kbps) were rated equally, independently of the framerates used. The initial analysis revealed that a significant increase in the bitrate-framerate resources neither improved the perceived quality nor reached the most pleasantly rated 2D quality. This result might be strongly influenced by the display technology used, and we therefore recommend replicating the test with different displays in further studies. Further steps of analysis need to target the content-dependent preferences in more detail and explore the qualitative impressions and experiences of constructed quality to support and draw a deeper understanding of the quantitative results.


Table of Contents

1 Introduction .......................................................................................................................... 4

2 Research method ................................................................................................................. 6

2.1 Experiment 1 ................................................................................................................. 6

2.1.1 Participants ............................................................................................................ 6

2.1.2 Test Procedure ....................................................................................................... 6

2.1.3 Context of viewing .................................................................................................. 8

2.1.4 Stimuli material ....................................................................................................... 9

2.1.5 Production of Test Materials ................................................................................. 12

2.1.6 Presentation of Test Materials .............................................................................. 14

2.1.7 Data-collection tools ............................................................................................. 15

2.1.8 Method of analysis ............................................................................................... 16

2.2 Experiment 2 ............................................................................................................... 16

2.2.1 Participants .......................................................................................................... 16

2.2.2 Test procedure ..................................................................................................... 16

2.2.3 Contexts of viewing .............................................................................................. 17

2.2.4 Stimuli material ..................................................................................................... 19

2.2.5 Production of test material .................................................................................... 19

2.2.6 Presentation of test material ................................................................................. 19

2.2.7 Data-collection tools ............................................................................................. 19

2.2.8 Method of analysis ............................................................................................... 21

3 Results ............................................................................................................................... 22

3.1 Experiment 1 ............................................................................................................... 22

3.1.1 Relation between acceptance and satisfaction measures .................................... 22

3.1.2 Comparisons between contexts ........................................................................... 22

3.1.3 Comparisons between parameters – averaged across contents .......................... 24

3.2 Experiment 2 ............................................................................................................... 26

3.2.1 Relation between acceptance and satisfaction measures .................................... 26

3.2.2 Comparisons between contexts ........................................................................... 26

3.2.3 Comparisons between audio-video bitrates – averaged across contents.............. 28

4 Discussion and conclusions ............................................................................................... 31

5 Acknowledgements ............................................................................................................ 33


1 Introduction

Conventional quality evaluation experiments are conducted in highly controlled laboratory circumstances. These controlled measures represent an artificial setting; they enable accurate control of variables and replicable experiments, but suffer from a limited and unknown level of realism (Wynekoop & Russo 1997). However, in many cases the target applications of ubiquitous computing, such as mobile (3D) TV, are designed to be used in multiple, heterogeneous and dynamic circumstances (Jumisko-Pyykkö et al. 2008, Jumisko-Pyykkö & Vainio submitted). From the viewpoint of design, it is desirable to know as early as possible in the development phase whether the system under study fulfils the users' requirements in its actual contexts of use.

There are only a few quality evaluation experiments conducted in field settings (an overview is given in Jumisko-Pyykkö & Strohmeier 2008). Jumisko-Pyykkö & Hannuksela (2008) compared a controlled setting and three field settings for mobile television by varying the audiovisual contents and residual transmission error rates. The experiment was conducted in different physical contexts with related tasks (Bus – travel to the library, Cafe – relax, Railway station – wait for a friend). Their results showed that in the actual context of use the evaluations were more favourable and less discriminative. In other words, the quality requirements drawn from the actual context of use can be lower than those from the controlled context. In contrast, Knoche & Sasse (2008) conducted a comparison between controlled and field (underground) settings by varying image resolution with different content and shot types. They concluded that in the field settings, image size had a higher value for users than in the lab.

To sum up: 1) the number of studies targeting quality evaluation in field settings is very small in the field of quality evaluations; 2) the previous work has underlined the existing differences between the contexts of study; but 3) it has not tried to understand the phenomena or factors behind this difference; and 4) to the best of our knowledge, there is no previous work on 3D quality on mobile devices conducted in field settings.

The first goal of this study is to explore how quality is experienced in different contexts of use. We define the context of use as the circumstances surrounding human-computer interaction, containing components of task, physical, social, temporal, and technical and informational contexts (Jumisko-Pyykkö & Vainio, submitted). We conduct two studies where quality is measured in a controlled laboratory and altogether in four different field circumstances. To understand the contextual factors that may influence these evaluations, we developed a hybrid data-collection tool. It includes 1) real-time event and gaze-data recording, 2) expert ratings of contextual events, 3) participants' descriptive impressions and experiences of context, and 4) workload measurements for each context. Our work is part of the further development of UC-MQoE (User-Centered Multimedia Quality of Experience; Jumisko-Pyykkö & Strohmeier 2008, Gotchev et al. 2009).

The majority of quality evaluation experiments target the influence of one medium at a time, even though the final application may combine two or more media. From the viewpoint of multimodal perception, different modalities can complement and modify the perceptual experience created by other perceptual channels, and a multimodal experience is more than the simple sum of two different perceptual channels (Hands 2004, Beerends & Caluwe 1999, Cohen 2001, McGurk & McDonald 1976, Shimojo & Shams 2001, Reeves & Nass). This can also be interpreted such that the quality, or even the presence, of another modality can influence the overall quality requirements. The previous work on audiovisual video quality for mobile devices provides a good example of this, even though the work on this topic is still relatively limited (Winkler & Faller 2005, Ries et al. 2005, Jumisko-Pyykkö et al. 2005, Jumisko-Pyykkö & Häkkinen 2008, Knoche et al. 2005). However, there are no publications available about audiovisual 3D video quality for mobile devices.


The second goal of this study is to investigate the impact of audio and video coding parameters on experienced overall 3D quality. In two experiments, we vary video bitrate and framerate, audio bitrate and visual presentation mode (2D/3D) on a mobile device with several audiovisual contents. Quantitative overall quality is measured using a bidimensional quality evaluation method (Jumisko-Pyykkö et al. 2008) combining quality satisfaction with an acceptance threshold. A free-description task and a stimuli-assisted description task are used to collect the participants' impressions and experiences of quality.

In the current version of the report, we describe the method used in both studies and the results of the quantitative overall quality evaluations.


2 Research method

2.1 Experiment 1

The first goal of the experiment was to compare the preferences of experienced quality between two different evaluation contexts. The second goal was to explore the influence of visual bitrate-framerate combinations on experienced quality. We conducted the experiment in two indoor contexts, varied three bitrates with two framerates, and presented the stimuli with four different contents on a mobile device.

2.1.1 Participants

A total of 30 participants, equally stratified by gender and aged between 18 and 45 years, participated in the experiment. The sample consisted mostly (80%) of naive or untrained participants, and they did not belong to the group of innovators or early adopters (Rogers 2003). These participants had little to no prior experience of quality evaluation experiments, they were not experts in technical implementation, and they were not studying, working or otherwise engaged in information technology or multimedia processing (ITU-R BT.500-11, ITU-T P.911). None of the participants had taken part in audiovisual evaluations within the last 12 months.

2.1.2 Test Procedure

The test procedure (Figure 1) was divided into pre-test, test and post-test sessions.

In the pre-test session, sensorial tests and demographic data collection took place. Visual acuity (Landolt chart, 20/40), colour vision (Ishihara test) and acuity of stereo vision (Randot stereotest, 0.6) were measured. The demographic data collection broadly targeted aspects of television consumption habits, use of different devices, previous 3D experiences and technology attitude.

At the end of the pre-test session, combined training and anchoring took place. Participants were shown the extremes of the sample qualities and all contents to familiarise them with the quality scale, test content types and evaluation task (ITU-R BT.500-11). The whole pre-test session was held in the lab environment.

In the test, the bidimensional research method of acceptance was used (Jumisko-Pyykkö et al. 2008). The stimuli were presented one by one and rated independently and retrospectively by applying the single stimulus / absolute category rating (ACR) method (ITU-R BT.500-11, ITU-T P.911). After each clip, participants retrospectively marked the overall quality satisfaction score on a discrete, unlabeled scale from 0 to 10 and the acceptance of quality for viewing mobile 3DTV (binary yes/no scale). To measure satisfaction, we used a wide scale to compensate for the end-avoidance effect and the problems of labelled scales due to cultural differences (Lawless & Heyman 1998). Acceptance of quality was measured on a binary scale (yes/no) to find a threshold for acceptable quality (Jumisko-Pyykkö et al. 2008). The instructions for the quality evaluation tasks were as follows. For gathering the quality satisfaction score, the participants were asked to assess the overall quality of the presented clip. The acceptance of quality was evaluated by asking whether the participants would accept the presented overall quality if they were watching mobile 3D television. No other evaluation criteria or advice were given.
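To make the bidimensional measure concrete, the following minimal Python sketch (our own, not from the report; all names are hypothetical) models one retrospective rating as a pair of a 0-10 satisfaction score and a binary acceptance vote:

```python
from dataclasses import dataclass

@dataclass
class ClipRating:
    """One retrospective rating in the bidimensional acceptance method."""
    clip_id: str
    satisfaction: int   # discrete, unlabeled scale from 0 to 10
    acceptable: bool    # "would you accept this quality for mobile 3DTV?"

    def __post_init__(self):
        if not 0 <= self.satisfaction <= 10:
            raise ValueError("satisfaction must be on the 0-10 scale")

# e.g. one participant's rating of a single clip:
rating = ClipRating("knight_320kbps_15fps_3D", satisfaction=7, acceptable=True)
```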

After completing the actual evaluation task, participants filled in two questionnaires: the Simulator Sickness Questionnaire, SSQ (Kennedy et al. 1993), and the task load index, NASA TLX (Hart & Staveland 1988). The actual test phase was conducted in the two contexts of use.

The post-test session gathered qualitative data on 1) experiences of mobile 3D television and 2) experiences of the viewing contexts. A semi-structured interview gathered participants' impressions,


experiences, interpretations and descriptions of quality (Jumisko-Pyykkö et al. 2007). At the beginning, the free-description task was conducted without supporting stimuli material. Participants were asked the main question: "What did you pay attention to while evaluating quality?". Further supporting questions were used to examine the pleasantness or importance of the described factors: "Did you experience X as pleasant / unpleasant / acceptable / significantly influencing?". All the factors were described by the participant and were only repeated by the interviewer in order to understand the important quality factors. The free-description task was followed by the stimuli-assisted description task. Participants were shown a stimulus, and the same main and supporting questions were presented as in the free-description task. A total of 9 clips were shown in random order. Similarly, a semi-structured interview was conducted to gather the impressions and experiences of the different viewing contexts. First impressions were asked after each evaluation context, and their comparison was interviewed more broadly in the post-test session.

At the end of the session, participants also answered a questionnaire that measured their attitudes towards each presented content, and a short questionnaire about user requirements for mobile 3D television (Strohmeier et al. 2008).

Figure 1 Test procedure of experiment 1. The different versions show the randomisation of the starting context of use (laboratory/home).



2.1.3 Context of viewing

The experiment took place in two evaluation contexts: a laboratory and a home-like context. The conditions in the laboratory were fixed according to the ITU-T P.911 specifications (Table 1; Figure 2). The viewing distance could not be fixed as suggested in the ITU standards (ITU-T 1998), since the preferred viewing distance of stereoscopic content depends on the user and the device. The viewing distance also varies according to the material, its production process and the camera angles during recording. We used an initial viewing distance of 40cm, ~10 times the video height, as suggested by Knoche et al. (Knoche et al. 2005, Knoche & Sasse 2008); however, participants could change the distance if desired.

The home-like context was chosen as it represents the conventional viewing conditions for mobile television (Knoche & McCarthy 2005, O'Hara et al. 2007) and is interesting for mobile 3DTV viewing (Jumisko-Pyykkö et al. 2008, Strohmeier et al. 2008). The home context (Figure 2) was set up in a nearby room with a sofa, an armchair, a bookshelf, a rug, some paintings and other paraphernalia. The room was lit with a single floor lamp with a dimmed 250W halogen bulb shedding its light towards the ceiling. A short description of the contexts of use is given in Table 2 and Table 3, and a detailed analysis of the contexts of use is given in APPENDIX 2.

In the laboratory context, the participant could adjust the seat and the height of the table where the device was placed on a stand perpendicular to the viewer's eye. In the home context, the participants were given a scenario in which they were planning to watch mobile 3D television to entertain themselves during a quiet Sunday evening. They were encouraged to select a place in the room where they would feel natural and to make themselves comfortable. The device was given to them, and they could sit and hold it in the way they felt most comfortable.

Table 1 ITU (ITU-T P.911) specification (ITU-T 1998)

Parameter Specification Test setup

Viewing distance 1-8 H (image height) 10 H (image height)

Peak luminance of the screen 100-200 cd/m2 69 cd/m2 (Note 1)

Ratio of screen's peak black to peak white luminance ≤ 0.1 0.014 (Note 1)

Background room illumination ≤ 20 lux Lab: 12 lux; Home: 26 lux

Background noise level ≤ 30 dBA 25 dBA (Note 2)

Listening level ~80 dBA 75 dBA (+10 dBA for peaks)

Note 1 – Peak luminance was measured displaying a black and white fullscreen image. Brightness setting was set to the maximum (as in the tests) and the parallax barrier of the device was on.

Note 2 – Some individual peaks of background noise were possible from the surrounding environment, such as adjacent rooms or the ventilation system.


Figure 2 The images of physical laboratory and home contexts

Table 2 Laboratory context characteristics

Context name: Laboratory

Participant's task: -

Context characteristics: Quiet, fairly dark room. No distractions. Alone. No hurry. Test-like situation.

Table 3 Home context characteristics

Context name: Home

Participant's task: Watch programs to entertain yourself during a quiet Sunday afternoon

Context characteristics: A quiet, fairly dimmed room. No distractions. Alone. No hurry. Comfortable surroundings.

2.1.4 Stimuli material

Four types of content – documentary, animation, series and user-created material (see Table 4 and Table 5) – were selected for test clips according to their potential for mobile 3D television (Jumisko-Pyykkö et al. 2008, Strohmeier et al. 2008) and audiovisual characteristics. The choice of stereo content was a compromise, since high-quality reasonably long stereoscopic content for mobile devices is still rather scarce.

The length of the stimuli material was approximately 35 seconds (32–37 seconds). The chosen duration gave users sufficient time to accommodate their vision and perception to the stereoscopic material. The use of shorter stimuli is recommended due to the limitations of human working memory (Baddeley 1998), but with stereoscopic material, shorter stimuli would have been meaningless because of the time required for untrained participants to adapt their vision.

Each clip contained a meaningful segment of a supposed 3DTV program, without cutting the start or end of a sentence, and several shots with different distances and angles, to be representative of future mobile 3D television content. The clips were selected so that they contained differing amounts of movement (temporal details), details (spatial details), amount of depth, dynamism of depth and number of scene cuts. For audio they had music, sound effects, speech or a combination thereof. The characteristics of the stimuli are presented in Table 5. These characteristics are based


on the mean values of expert evaluations conducted by six assessors (inter-rater agreement for all test material was moderate, Kappa = 0.624, p < .001; see the answering sheet in APPENDIX 1).

Table 4 Content descriptions

Image of content Name and content description

Knight’s Quest 4D

A knight attempts to rescue another knight. The knight walks into a trap and falls down a shaft, but manages to get to the other side safely using a grappling rope. The scene ends with the knight noticing he left his sword on the other side of the shaft.

Upper Middle Rhine Valley

The clip starts with a panning camera over a river valley. Next are several scenes of a vineyard with people gathering grapes. The clip ends with a man pouring gathered grapes into a container.

Liberation of Plzen

A commemorative parade of the liberation of Plzen. Scenes with marching troops and army vehicles.

Virtual Visit to Suomenlinna

Scenes from the Suomenlinna fortress. A boy and a woman walk in a scene with a tree in the foreground. In the next scene they run up a grassy hill.

Table 5 Content properties

Genre Content Spatial details Temporal details Depth Depth dynamism Scene cuts Audio Length [s]

Animation Knight's Quest 4D High High Med High High Music, Effects 37

Documentary Upper Middle Rhine Valley Low Med High Low Low Music 34

User-created Liberation of Plzen Med Low Med High Low Music 37

Series Virtual Visit to Suomenlinna Low Low Med Med Low Speech, Music 32

Additional contents and evaluation task: At the end of each evaluation task one additional clip was presented (see Table 6 and Table 7), representing short (41-62 second) trailers from actual 3D movies. The purpose of these additional clips was to explore whether the results from


the short repeated stimuli material would be valid with longer unfamiliar content. The contents were unfamiliar to the viewers in order to make the viewing situation as similar as possible to normal content consumption. Past work has indicated that the evaluations might be more positive in this case (Jumisko-Pyykkö & Hannuksela 2008, McCarthy et al. 2004). One of these clips was played as the last clip in each context, never showing the same clip to the same participant twice. The order of the presented trailers was randomized. The results of this part of the tests are not reported in the current version of the report.

Table 6 Trailer descriptions

Image of content Name and content description

Bugs!

A trailer from a nature documentary starring several insects in close-up shots.

Meet the Robinsons

A trailer from a Disney movie with several animated characters.

SOS Planet

A trailer from an animated nature documentary. Several shots of animated scenes and animal characters.

Table 7 Trailer properties

Genre Content Spatial details Temporal details Depth Depth dynamism Scene cuts Audio Length [s]

Documentary Bugs! High Low Low Low Med Speech, Music 57

Animation Meet the Robinsons High High High High High Speech, Music, Effects 41

Animation SOS Planet Med Med High High High Speech, Music, Effects 62


[Figure: the four stimulus contents (Animation, Documentary, User-created, Series) plotted on axes of spatial detail (low-high) vs temporal detail (low-high)]

Figure 3 Stimuli contents and their visual characteristics

[Figure: the three trailers (Bugs, MTR, SOS) plotted on the same spatial vs temporal detail axes]

Figure 4 Trailer contents and their visual characteristics

2.1.5 Production of Test Materials

The selected test materials were encoded using the recommended codecs for IP datacasting over DVB-H (EBU 2005, Faria et al. 2006), which are also used in Finland for current mobile TV broadcasting services (DVB-H 2008). Visual content was encoded using an AVC/H.264 encoder with the VGA (640x480) letterbox resolution of 640x360 (widescreen 16:9 ratio). For audio encoding, AAC-HEv2 (with PS+SBR) was used with a sampling rate of 44.1 kHz, 16 bits per sample, stereo.

The video codec used was the open source x264 'Skystrife' build 1077 (x264 2009) and the audio codec was Nero AAC 1.3.3.0 (Nero 2009). The streams were encoded and muxed into .mp4 containers using meGUI 0.3.1.1010 (meGUI 2009). Sound clips were first resampled to 44.1 kHz (when necessary) and normalized using Audacity 1.2.6 (Audacity 2009) and then exported to


stereo PCM .wav files. The video clips were cut and resized using frameserving via the Avisynth 2.5.7 scripting language (Avisynth 2009) and loaded directly, without intermediate processing, into meGUI for encoding and muxing with the audio streams. Using Avisynth frameserving allowed us to produce clips with the best possible quality, without intermediate steps that might degrade it. An example of such a script can be found in APPENDIX 5.

For phase 1, video bitrates of 160, 320 and 768 kbps and frame rates of 10 and 15 frames per second were chosen (Table 8, below). The frame rate changes were made by deleting frames according to the desired ratio via Avisynth. The codec was set to use the baseline profile of AVC/H.264 for mobile use. This profile does not support all the advanced compression features, but provides additional leeway for battery power savings. The audio bitrate was fixed at 48 kbps, which received excellent quality scores in EBU tests (EBU 2003).
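The actual Avisynth script is given in APPENDIX 5; purely as an illustration of the fixed-ratio frame-deletion logic described above, here is a sketch of our own in Python (function name is hypothetical):

```python
def decimate_indices(n_frames: int, src_fps: float, dst_fps: float) -> list:
    """Indices of source frames kept when reducing the frame rate by
    dropping (not interpolating) frames at a fixed ratio."""
    step = src_fps / dst_fps            # e.g. 15 -> 10 fps gives a step of 1.5
    kept, pos = [], 0.0
    while int(pos) < n_frames:
        kept.append(int(pos))
        pos += step
    return kept

# Reducing 15 fps to 10 fps keeps two out of every three frames:
print(decimate_indices(9, 15, 10))      # [0, 1, 3, 4, 6, 7]
```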

The parameters were chosen according to previous tests on standard mobile TV, where typical bitrates varied between 100kbps and 500kbps and frame rates between 6 and 25 frames per second (Faria et al. 2006, Knoche et al. 2005, Gulliver & Ghinea 2006, McCarthy et al. 2004, Winkler & Faller 2006, EBU 2003, Jumisko-Pyykkö & Häkkinen 2006, Wijesekera et al. 1999). Current implementations of conventional 2D mobile TV broadcasting vary between 200kbps and 1000kbps per channel (DVB-H 2008). Previous experience of the bandwidth needed to encode the additional third dimension suggested that some additional bandwidth was needed (Tikanmäki et al. 2008). 320kbps derives from the basic-rate 384kbps third-generation (3G) cell phone networks (EBU 2005), taking into account the additionally needed encapsulation overhead and leaving sufficient space for audio. It is also a typical current-day mobile television bandwidth, and a quite similar setup is used for example in Vietnam (DVB-H 2008). 160kbps was selected as a clearly low-bandwidth scenario with moderately easily detectable encoding artifacts in most of the chosen contents. 768kbps was selected as a high-end scenario with a sufficient difference to the lower bitrates. All bitrates are for the entire video, i.e. including both video channels with 3D content.
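The 320kbps choice can be sanity-checked with simple arithmetic; note that the encapsulation overhead percentage below is our own assumption for illustration, not a figure from the report:

```python
channel_kbps = 384     # basic-rate 3G bearer (EBU 2005)
audio_kbps   = 48      # audio bitrate fixed in these tests
overhead     = 0.04    # assumed IP/DVB-H encapsulation overhead (illustrative)

video_budget = channel_kbps * (1 - overhead) - audio_kbps
print(f"video budget ~ {video_budget:.0f} kbps")   # ~321 kbps -> 320 kbps chosen
```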

In addition to the tested core bitrate-framerate combinations, two different types of hidden anchors were used in the tests. The 3D extreme parameter combination (1536 kbps, 24 fps) was chosen solely to see how good the experienced quality can be in high-quality on-demand services or when content is purchased separately on removable media, i.e. in situations where bandwidth is less of an issue than in broadcast-based wireless channels. All trailer clips were encoded with the highest parameter set of 1536kbps and 24fps, in 3D only.

We wanted to compare the quality of conventional 2D mobile television broadcasts to 3D stereo-video broadcasts. 2D was encoded so that it received the same total bitrate as the two 3D channels combined, as this is the way conventional 2D broadcasts would be done. The selected 2D presentation cases (160kbps, 15fps; 320kbps, 15fps) represent acceptable mobile television quality for viewers (Jumisko-Pyykkö et al. 2008, Jumisko-Pyykkö 2008).


Table 8 Parameters of experiment 1

Video bitrate (kbps) Framerate (Hz) Audio bitrate (kbps) Video mode

160 10 48 3D

160 15 48 3D

160 15 48 2D

320 10 48 3D

320 15 48 3D

320 15 48 2D

768 10 48 3D

768 15 48 3D

1536 24 48 3D

The stated bitrates do not include encapsulation and transmission overhead. The resulting bitrates in the clips differed somewhat from those stated above, as the encoder created files with varying total bitrates. The video codec performed quite consistently using the ABR (average bitrate) implementation, as the bitrate for the highest parameter set varied between 1424 and 1573 kbps – a maximum of 7% difference from the requested bitrate. The audio bitrate was very accurate, with only a maximum of 1 kbps (2%) difference from the requested bitrate.
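The reported deviations are easy to verify from the stated extremes:

```python
requested_kbps = 1536
observed_kbps = [1424, 1573]    # extremes reported for the highest parameter set

for measured in observed_kbps:
    deviation = abs(measured - requested_kbps) / requested_kbps
    print(f"{measured} kbps -> {deviation:.0%} from requested")
# 1424 kbps -> 7% from requested
# 1573 kbps -> 2% from requested
```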

2.1.6 Presentation of Test Materials

The test materials were presented on a prototype mobile 3DTV set (Figure 5). The prototype had a 4.3-inch (95mm x 60mm) touch-enabled transmissive autostereoscopic LCD screen with a native resolution of 800x480 pixels (WVGA, 200 PPI). The display has two views which are interleaved at the pixel level. The screen is based on parallax barrier technology, which can be turned on and off, and is landscape/portrait switchable. In our tests the parallax barrier was always on and the landscape mode was used. The device uses side-by-side format videos, where two video channels (one for each eye) are horizontally squeezed to half their normal width and then placed side by side. The parallax barrier then divides the views by blocking the light from certain angles so that each eye only sees the channel intended for it. The main drawbacks of a parallax barrier are that by blocking part of the light it also lowers the screen brightness, and that the viewer has to be positioned in a predefined sweet spot to experience the 3D effect.
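As an illustration of the side-by-side format, here is a minimal sketch of our own using naive 2:1 column subsampling; a real production pipeline would low-pass filter before decimating:

```python
import numpy as np

def pack_side_by_side(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Squeeze each view to half width and place them side by side."""
    # keep every second column of each view (naive horizontal squeeze)
    return np.hstack([left[:, ::2], right[:, ::2]])

left  = np.zeros((360, 640, 3), dtype=np.uint8)       # left-eye frame
right = np.full((360, 640, 3), 255, dtype=np.uint8)   # right-eye frame
frame = pack_side_by_side(left, right)
print(frame.shape)   # (360, 640, 3): full width, each view squeezed to 320 px
```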


Figure 5 The mobile 3D television prototype used in the tests

For audio playback, headphones were used and the starting audio loudness level was adjusted to 75 dBA (+10 dBA for peaks) for unbiased starting conditions. In both evaluation contexts, the listening level was adjusted using equipment that simulates the outer auditory channel of a human ear and a 1 kHz sine input. Audio was listened to on a pair of Sony MDR-E818LP stereo ear buds.

The prototype device had a stereoscopic screen with a native resolution of 800x480, unrealistically high for near-future DVB-H implementations, so the test materials were presented centred on the screen at the VGA letterbox resolution of 640x360. The screen measured 95mm x 60mm (4.3-inch diagonal), while the video covered 75mm x 42mm (3.3-inch diagonal). Clips were always played in full-screen mode, leaving black borders all around the picture.

The test materials were presented individually in randomized order, without presenting clips from the same content consecutively. The clips were played directly from the memory module of the device and were therefore not susceptible to transmission errors. The media player used was the open source VLC player (version 0.8.6a for 3D clips and 0.8.6i for 2D clips) (VLC 2009). The parallax barrier was always left on by having different video output settings in each of the players, so that together they displayed both 3D and 2D content correctly. This was important, since participants could easily have noticed the increase in display brightness when the parallax barrier was switched off, and this has been a reported problem (Strohmeier 2007). The player switches were made so that there were also intentionally false switches (i.e. quits and starts of a new player window without actually changing the viewing mode). The locations of these switches were randomized.

The order of contexts was randomized between test sessions. Each test session consisted of tests in both contexts. In each context, participants watched a total of 39 video clips, with the first 2 being dummy clips and the last one being a trailer. One test session lasted about 2 hours in total. The tests were done during a three-week period.
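The randomisation constraint mentioned above (no two consecutive clips from the same content) can be met with simple rejection sampling; this is our own sketch, as the report does not state how the ordering was generated:

```python
import random

def shuffle_no_consecutive(clips, content_of, max_tries=10_000):
    """Random order in which no two consecutive clips share a content."""
    for _ in range(max_tries):
        order = random.sample(clips, len(clips))
        if all(content_of(a) != content_of(b) for a, b in zip(order, order[1:])):
            return order
    raise RuntimeError("no valid order found")

# 4 contents x 9 parameter combinations = the 36 core clips of experiment 1
clips = [(content, params)
         for content in ["Knight", "Rhine", "Plzen", "Suomenlinna"]
         for params in range(9)]
order = shuffle_no_consecutive(clips, content_of=lambda clip: clip[0])
```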

2.1.7 Data-collection tools

The test sessions were recorded with a night-vision-enabled digital video camera facing the user to record participants' attention data, to be compared later with the phase 2 field experiments. The interviews were audio-recorded.


2.1.8 Method of analysis

Satisfaction data were analyzed using Friedman's test and the Wilcoxon matched-pairs signed-rank test, because the assumption of parametric methods (normality) was not met (Kolmogorov-Smirnov: p<0.05). Friedman's test is applicable for measuring differences between several related ordinal data sets, and Wilcoxon's test between two. McNemar's test was applied to the nominal retrospective acceptance evaluations to test the differences between two categories in the related data (Coolican 2004).

A chi-square test of independence evaluates the independence between the distributions of two variables measured on a categorical scale; it was used to relate the acceptance and satisfaction ratings to each other (Coolican 2004).

Repeated-measures ANOVA can be used to compare differences between three or more conditions for interval data in a related design; it was used e.g. in the comparisons of the NASA TLX data (Coolican 2004).
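For illustration, the same battery of tests can be run with SciPy and statsmodels; the data below are synthetic stand-ins, not the study's ratings:

```python
import numpy as np
from scipy import stats
from statsmodels.stats.contingency_tables import mcnemar

rng = np.random.default_rng(0)
# synthetic 0-10 satisfaction scores: 30 participants x 3 related conditions
sat = {name: np.clip(rng.normal(mean, 1.5, size=30), 0, 10)
       for name, mean in [("160kbps", 3.5), ("320kbps", 5.0), ("768kbps", 5.2)]}

# Kolmogorov-Smirnov normality check (the report found p < 0.05,
# motivating the non-parametric tests)
x = sat["320kbps"]
print(stats.kstest(x, "norm", args=(x.mean(), x.std())))

# Friedman's test across several related conditions
print(stats.friedmanchisquare(*sat.values()))

# Wilcoxon matched-pairs signed-rank test between two conditions
print(stats.wilcoxon(sat["160kbps"], sat["320kbps"]))

# McNemar's test on paired binary acceptance ratings (2x2 cross-tab)
print(mcnemar([[20, 4], [10, 26]]))
```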

2.2 Experiment 2

The goal of the experiment was to examine the quality requirements in three different evaluation contexts. We also examined the impact of audio-video bitrates on experienced quality in these three contexts and compared them across the contexts.

2.2.1 Participants

Participants were recruited according to the same requirements as in experiment 1. Participants sensitive to motion sickness were excluded from the sample in the recruiting phase.

2.2.2 Test procedure

The test procedure contained three parts: pre-test, test and post-test sessions. In the pre-test session, quality anchoring and training took place to rehearse the participants in the quality range, contents and evaluation task (ITU-R BT.500-11). The anchoring was always conducted in the first evaluation context.

Each participant started either at the train station or at the laboratory and travelled by bus to the other context (Figure 6). Full randomization of the starting contexts was not possible due to the time constraints of the experiments (max 2h).

The actual evaluation task applied the bidimensional research method of acceptance, as in experiment 1 (Jumisko-Pyykkö et al. 2008). NASA TLX was filled in after each context. After each context we also conducted a short mid-test interview, which gathered impressions and experiences of the previous context, its relevance to the participant's life and the imagined use of the device in such a situation. These interviews were held while moving between contexts. In each context, participants watched a total of 26 video clips, with the first one being a dummy clip and the last one being a trailer.

Sensorial tests were done in the laboratory environment for all participants in the same way as in phase 1.

The post-test session contained a free-description task about the impressions of quality and contexts, similar to experiment 1. The stimuli-assisted description task was not conducted due to time constraints. In addition to the surveys about the relation to content and user requirements, the SOP questionnaire (Childers et al. 1985), measuring participants' visual versus verbal information processing, was also included in the post-test session.


Figure 6 Phase 2 test procedure. The different variations show the quasi-randomisation of the starting context of use.

2.2.3 Contexts of viewing

Three different viewing contexts were selected: a conventional controlled laboratory context, a bus-travelling context and a railway station waiting context. These natural contexts were chosen as they have been found to be popular contexts in previous studies relating to mobile 3DTV (Strohmeier et al. 2008) and mobile TV (Knoche & McCarthy 2005, O'Hara et al. 2007). The train station is a common waiting situation in which people spend idle time while waiting for their train to leave. The bus is a typical scenario in which people would like to kill time by watching television while commuting. A detailed analysis of the characteristics of the contexts is given in APPENDIX 2, and a light version can be found below.



Figure 7 The station context. Figure 8 The physical bus context.

In the bus context, participants were allowed to select their seat, and the moderator sat next to them (Figure 8). The participants had an extra task: to keep an eye out for a particular bus stop at which to get off the bus.

Table 9 Bus context characteristics

Context name: Bus

Participant's task: Watch programs while commuting to the train station to catch the train (or to the Hervanta suburb to meet a friend)

Context characteristics: Moving vehicle with acceleration and braking, varying light and noise levels, possibly cold (tests were done during the winter), distractions to navigate and orientate, plenty of other passengers, typically no interaction, possible social pressure.

At the train station, participants sat at a café table in the main hall (Figure 7). As a parallel task, participants were given a scenario in which they were waiting for their train to leave, and they had to keep an eye on the timetable to know when to leave for the platform to catch the train.

Table 10 Station context characteristics

Context name: Station

Participant's task: Watch programs while waiting for the train

Context characteristics: Varying noise levels, distractions to check the timetable, plenty of other people, typically no interaction, possible social pressure.


2.2.4 Stimuli material

The same test materials were used as in phase 1.

2.2.5 Production of test material

For phase 2, the selected video bitrates were 320 and 768 kbps. This time, audio bitrates varied between 18 and 48 kbps, while the frame rate was fixed at 15 fps (Table 11, below). 15 fps was chosen since, according to phase 1, it produced larger differences in satisfaction between bitrates than 10 fps. It is also common in current mobile television implementations (DVB-H 2008).

The 48kbps audio bitrate was chosen because it is common in present-day mobile television applications (DVB-H 2008) and has been tested as sufficient (EBU 2003). The lower 18kbps audio bitrate was chosen because it was deemed low enough to differentiate clearly from the higher bitrate, but still comprehensible with the selected contents.

Trailer clips were encoded with the 3D parameter set of 320kbps for video and 48kbps for audio.

As in phase 1, the same Avisynth scripts were loaded directly into meGUI via frameserving. The codecs and production process used were the same as in phase 1.

Table 11 Phase 2 parameters for quality evaluation

Video bitrate (kbps) Framerate (Hz) Audio bitrate (kbps) Video mode

320 15 18 3D

320 15 48 3D

320 15 18 2D

320 15 48 2D

768 15 18 3D

768 15 48 3D

2.2.6 Presentation of test material

The devices, players and headphones were the same as in experiment 1. The clips were played in a randomized order (ITU-R BT.500-11). The test lasted about 2 hours in total. The tests were done during a three-week period.

The built-in battery of the prototype 3DTV device broke down halfway through experiment 2. After 15 successful tests, we had to abandon further tests in the bus context, as no back-up solution was available within the time left to solve the battery problem. The last 15 tests were carried out with just the laboratory and station contexts, where power outlets were available.

2.2.7 Data-collection tools

The laboratory tests were done in the same way as in phase 1 and were recorded with a night-vision-enabled digital video camera facing the user to record participants' attention data, as in phase 1.


The field experiments (i.e. the bus and station contexts) were documented using a mobile camera system (Figure 8 and Figure 9). We aimed to make the system as unobtrusive as possible to the participant and as invisible as possible to other people. The participants carried a case originally intended for a notebook, which housed a video hub capable of combining up to four different video streams and recording them on the fly to a removable memory card, along with batteries for the hub. The setup consisted of three cameras: one recording the participant's field of view via a necklace camera, one observing the participant's face and eyes, and one recording the events on the prototype's screen. The necklace camera installation also housed a microphone. These videos were needed to capture the vital nature of the context and to understand the actual contextual characteristics.

The moderator conducted a structured observation during the experiment. He/she filled in an answering sheet that measured the context characteristics in each context of use while conducting the experiment. The sheet was based on the mobile context components and properties presented by Jumisko-Pyykkö & Vainio (submitted). The version used contained mainly questions about the physical, social, task and temporal contexts (APPENDIX 2). These measures included, for example, lighting and noise levels, viewing distance, shifting of the device, and social, technical and physical interruptions during the test.

The interviews were audio-recorded.

Figure 9 A snapshot of the combined video produced by the mobile camera system


2.2.8 Method of analysis

The method of analysis was the same as in experiment 1.


3 Results

3.1 Experiment 1

3.1.1 Relation between acceptance and satisfaction measures

First, we identify the acceptance threshold on the satisfaction scale, and we rely on the satisfaction ratings in the detailed comparisons (Jumisko-Pyykkö et al. 2008). Acceptable quality was connected to scores between 5.8 and 9.1 (Mean=7.4, SD=1.68) on the satisfaction scale, and unacceptable quality to scores between 1.5 and 5.6 (Mean=3.6, SD=2.05). The distributions of retrospectively rated satisfaction and acceptance differed (χ²(10)=278.1, p<.001).

Figure 10 Relation of acceptance ratings to the mean satisfaction scale
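For illustration, the χ²(10) statistic corresponds to an 11x2 cross-tabulation of the eleven satisfaction categories against the binary acceptance votes; the sketch below uses synthetic stand-in data, not the study's ratings:

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(1)
satisfaction = rng.integers(0, 11, size=2000)                # 0-10 scores
accepted = (satisfaction + rng.normal(0, 1.5, 2000)) > 5.7   # noisy threshold

# mean satisfaction on each side of the acceptance threshold
print(satisfaction[accepted].mean(), satisfaction[~accepted].mean())

# 11 satisfaction categories x 2 acceptance categories -> df = 10
table = np.zeros((11, 2), dtype=int)
for score, acc in zip(satisfaction, accepted):
    table[score, int(acc)] += 1
print(chi2_contingency(table))
```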

3.1.2 Comparisons between contexts

Overall, quality was experienced as slightly higher in the controlled laboratory than in the home-like context. This result appeared with both acceptance (McNemar's test: p<.01) and satisfaction ratings (Wilcoxon: Z=-4.73, p<0.001) when averaged over all other factors. It is also worth mentioning that the presented quality fell into the category of unacceptable quality.


Figure 11 a) Acceptance and b) satisfaction ratings for different evaluation contexts when averaging across contents and parameters.

Overall workload is the average of the six TLX factors: mental demand, physical demand, time pressure, effort, frustration, and performance. The evaluation contexts had no significant impact on overall workload (t=-0.545, df=29, p=0.590, ns; Figure 12) or on any of its factors (p>.05).

Figure 12 a) Mean overall workload index when averaged over its factors, b) mean workload index factor by factor for different evaluation contexts.
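For illustration, a minimal sketch of the workload computation and the paired contextual comparison; the ratings are synthetic stand-ins, and only the factor list follows the report:

```python
import numpy as np
from scipy.stats import ttest_rel

FACTORS = ["mental demand", "physical demand", "time pressure",
           "effort", "frustration", "performance"]

rng = np.random.default_rng(2)
lab  = rng.uniform(0, 100, size=(30, len(FACTORS)))   # participants x factors
home = rng.uniform(0, 100, size=(30, len(FACTORS)))

# overall workload = per-participant average of the six factors
overall_lab, overall_home = lab.mean(axis=1), home.mean(axis=1)

# paired t-test between the two evaluation contexts
print(ttest_rel(overall_lab, overall_home))
```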


3.1.3 Comparisons between parameters – averaged across contents

The relation of all parameter combinations to satisfaction scores is presented in Figure 13, and content by content in Figure 14. The main results are summarised in this chapter, while detailed pairwise comparisons of all parameters are described in APPENDIX 3.

Figure 13 Mean satisfaction score for each context averaged over all contents. The bars show 95% CI.

Laboratory context:

Bitrates - The bitrates had a significant impact on satisfaction scores (Friedman: FR = 43.90, df = 3, p < .001). The two highest bitrates (768kbps and 320kbps) provided equally the most satisfying quality (Wilcoxon: p>.05) over the lowest bitrate (160kbps; Wilcoxon: p<.001). This result appeared independently of the framerate used.

Framerate - The framerates had a significant impact on satisfaction scores (Friedman: FR = 49.45, df = 2, p < .001). The framerates were evaluated equally at the high bitrates (Wilcoxon: p>.05). At the lowest bitrate (160kbps), 10fps was preferred over 15fps (Wilcoxon: p<.05).

Comparison to the max case - The comparison to the maximum studied bitrate-framerate parameter combination reveals that increasing these resources would not improve quality satisfaction. The maximum case was rated equally with the 768kbps/10fps, 768kbps/15fps and 320kbps/10fps parameter combinations (Wilcoxon: p>.05) and outperformed the other 3D parameter combinations (Wilcoxon: p<.05).

2D vs 3D - Among all the tested parameter combinations, the most satisfying quality was provided by the 2D presentation mode (difference to the others: p<.001). The higher 2D bitrate (320kbps) outperformed the lower bitrate (160kbps) (Wilcoxon: p<.001). These two parameter combinations were the only ones to provide acceptable quality.


Home context:

Bitrates - The bitrates had a significant impact on satisfaction scores (Friedman: FR = 32.78, df = 3, p < .001). The highest bitrate (768kbps, 15fps) provided the most satisfying quality over the other bitrates (Wilcoxon: p<.05).

Framerate - The framerates also had a significant impact on satisfaction scores in the home context (Friedman: FR = 50.75, df = 2, p < .001). The higher framerate (15fps) was preferred over the lower framerate (10fps) at the higher bitrates (320kbps and 768kbps; Wilcoxon: p<.05). There was no significant difference between the framerates at the lowest bitrate (160kbps; Wilcoxon: p>.05).

Comparison to the max case - The maximum studied bitrate-framerate combination was rated equally with the 768kbps/10fps, 768kbps/15fps and 320kbps/15fps parameter combinations (Wilcoxon: p>.05), outperforming all the other 3D parameter combinations (Wilcoxon: p<.05).

2D vs 3D - Similarly to the laboratory context, the most satisfying quality was provided by the 2D presentation mode (difference to others p<.001). The higher 2D bitrate (320kbps) outperformed the lower bitrate (160kbps) (Wilcoxon: p<.001). These two parameter combinations were the only ones to provide acceptable quality in the home context as well.

Figure 14 Mean satisfaction score for each content and context. The error bars show 95% CI.


3.2 Experiment 2

3.2.1 Relation between acceptance and satisfaction measures

The acceptance threshold was identified on the satisfaction scale. Satisfaction scores between 6.2 and 9.1 (Mean=7.6, SD=1.46) were experienced as acceptable, while unacceptable quality was attached to scores between 2.5 and 5.8 (Mean=4.2, SD=1.65). The distributions of retrospectively rated satisfaction and acceptance differed (χ² (10)=647.0, p<.001).
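
As a hedged illustration of how such a threshold can be read off the scale (toy numbers, not the study's data), one can group the satisfaction scores by the binary acceptance rating and summarise each group:

import numpy as np

satisfaction = np.array([7.1, 8.3, 6.4, 3.2, 4.8, 5.5, 9.0, 2.9])
accepted = np.array([1, 1, 1, 0, 0, 0, 1, 0], dtype=bool)

for label, scores in (("acceptable", satisfaction[accepted]),
                      ("unacceptable", satisfaction[~accepted])):
    print(f"{label}: range {scores.min():.1f}-{scores.max():.1f}, "
          f"mean = {scores.mean():.1f}, SD = {scores.std(ddof=1):.2f}")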

Figure 15 Relation of acceptance ratings to the mean satisfaction scale

3.2.2 Comparisons between contexts

The evaluation contexts had an impact on quality evaluations when averaged over the other factors (FR = 8.90, df = 2, p < .012). Quality was experienced as equally high in the laboratory and railway station contexts (p>.05), while in the bus context quality was experienced as slightly lower (p<.05). In terms of acceptance, slightly higher quality was experienced in the station than in the lab context (p<.001).


Figure 16 a) Acceptance and b) satisfaction ratings for different evaluation contexts when averaging across contents and parameters.

Overall workload is the average of the six TLX factors (mental demand, physical demand, time pressure, effort, frustration, and performance). The evaluation contexts did not have a significant impact on overall workload (F(2,27)=1.454, p=0.232, ns). There was mostly no contextual impact on the individual TLX factors (p>.05); the only exception was time pressure. In the bus context, time pressure was experienced as significantly higher than in the lab context (t=-2.92, df=14, p<.05).

Figure 17 a) Mean overall workload index when averaged over its factors, b) mean workload index factor by factor for different evaluation contexts.


3.2.3 Comparisons between audio-video bitrates – averaged across contents

The relations of all parameter combinations to satisfaction scores are presented in Figure 18, and content by content in Figure 19. The main results are summarised in this chapter, while detailed pair-wise comparisons of all parameters are given in APPENDIX 4.

Figure 18 Mean satisfaction score for each context averaged over all contents. The error bars show 95% CI.

Laboratory context:

Video bitrate – The studied video bitrates had an impact on quality satisfaction scores (Friedman: FR = 7.34, df = 1, p < .01), but the detailed pair-wise comparisons did not reveal any significant differences.

Audio bitrate – The studied audio bitrates had no impact on quality satisfaction scores in the 3D presentation mode (Friedman: FR = 0.24, df = 1, p > .05). The higher audio bitrate gave the most satisfying quality in the 2D presentation mode (Wilcoxon: p<.05).

Presentation mode – The 2D presentation mode was experienced as giving the most satisfying quality (p<.001). The combination of 320kbps video bitrate with 48kbps audio bitrate gave the most satisfying audiovisual experience (Wilcoxon: difference to the 320kbps, 18kbps, 2D combination: p<.05). This result can be interpreted as showing that in an acoustically and visually controlled, noise-free environment both high audio quality and high visual quality contribute to a pleasant audiovisual experience.


Bus context:

The evaluations in the bus context were given by 15 participants, while the other contexts contain evaluations by 30 participants.

Video bitrate – The studied video bitrates had no impact on quality satisfaction scores in any of the presentation modes (Friedman: FR = 1.36, df = 1, p > .05).

Audio bitrate – The studied audio bitrates had no impact on quality satisfaction scores (Friedman: FR = 0.01, df = 1, p > .05; Wilcoxon: p>.05).

Presentation mode – The 2D presentation mode produced the most satisfying experienced quality among all tested parameters (Wilcoxon: p<.001). The 3D variants were rated equally (Wilcoxon: p>.05), and these evaluations differed significantly from the 2D evaluations (Wilcoxon: p<.001).

Station context:

Video bitrate – The studied video bitrates had a significant impact on quality satisfaction scores (Friedman: FR = 6.17, df = 1, p < .05). The higher video bitrate (768kbps) was rated higher than the lower bitrate (320kbps) when it was accompanied by the high audio bitrate (Wilcoxon: p<.05), while the other parameter combinations did not produce significant differences in quality (Wilcoxon: p>.05).

Audio bitrate – The studied audio bitrates did not have a significant impact on quality satisfaction scores (Friedman: FR = 0.84, df = 1, p > .05). The audio bitrates were rated equally in the 3D presentation mode (Wilcoxon: p>.05), while in 2D the lower bitrate (18kbps) was preferred over 48kbps (Wilcoxon: p<.05).

Presentation mode – The 2D presentation mode produced the most satisfying experienced quality among all tested parameters (Wilcoxon: p<.001). The lower audio bitrate was evaluated as giving slightly better quality in the 2D presentation mode (Wilcoxon: p<.05).


Figure 19 Mean satisfaction score for each content and context. The error bars show 95% CI.


4 Discussion and conclusions

The aim of this study was twofold. First, we examined how quality is experienced in different contexts of use. Second, we investigated the impact of audio and video coding parameters on experienced overall quality. In this report, we have described further steps in the development of a hybrid User-Centered Quality of Experience (UC-QoE) evaluation method for capturing the dynamic nature of the context of use. We used the described method to conduct two experiments. In this report, we describe only an initial quantitative analysis of the data.

Our results showed differences in the quality evaluations between contexts. The first experiment compared the laboratory and home-like viewing contexts. Experienced quality was higher in the laboratory than in the home-like context, while there were no differences in task load between the contexts. Both contexts were very similar in terms of social context, easy parallel tasks, and audio surroundings. However, the visual circumstances may be one factor contributing to this difference. In the laboratory, participants held the device in a face-up position, there were no reflections on the screen, and the lighting level was relatively low. In contrast, in the home-like context the participants' viewing position was face down, as in normal mobile device use, and the room had a higher level of illumination.

In the second experiment, quality was evaluated to be at the same level in the laboratory and station contexts. This result is surprising, as these settings provided quite different physical and social contexts and different conditions for the parallel tasks. As an improvement for further field experiments, sensors capturing the illumination on the screen as well as the audio surroundings should be used. Experienced quality was slightly lower in the bus context, which also showed a higher level of time pressure in the task load compared to the other contexts. The bus represents the most demanding context (the user's pseudo-motion, location estimation, social surroundings and variable physical conditions) and therefore may have required a high number of attention shifts between the evaluation task and the environment. Oulasvirta et al. (2005) have previously demonstrated that the average span of attention on a mobile device during web browsing in complex situations is only 4-8 seconds at a time. As the differences between the evaluation contexts are relatively small, it is possible that, due to the short clip length, the participants used the answering time to cope with the attention divided between the surroundings and the viewing task. Further steps of analysis are needed to confirm the existence of this phenomenon in our context comparison. We will complement the current findings with results on users' attention while viewing, mapped to detailed contextual characteristics, and with qualitative impressions and experiences from these contexts.

In the 3D mode, the current mobile television and near-future bitrates (320kbps and 768kbps) were rated equally, independently of the framerate used. A significant increase in the bitrate-framerate resources neither improved the perceived quality nor reached the most pleasantly rated 2D quality.

Current work has several limitations which may explain these results. First, the comparison of the 2D and 3D modes might be strongly influenced by the display technology used. In our comparison, all 2D and 3D streams had the same resolution (640x360). For 3D, both the left and right views were downscaled to 320x360, which changed the original aspect ratio. In the 2D case, one of the channels was downscaled to 640x360. The plug-in for the 3D display interleaves the left and right halves column-wise: first column from the left half, first column from the right half, second column from the left, second column from the right, and so on. The parallax barrier ensures that the left and right views go in different directions. In our experiment, the display plug-in does nothing for 2D. However, because the parallax barrier remains switched on, neighbouring pixels still go in different directions. Because of the high cross-talk and the lack of disparity, the observer may perceive a higher spatial resolution. This can be one reason why 2D was evaluated better than 3D. From this point of view, there are two suggestions for further work: 1) Other types of 2D-3D comparisons might be considered, with more equal conditions from the viewpoint of downscaling and presentation on the display technology used; in such a setup, the 2D and 3D conditions are equally downscaled, and 2D shows the same view twice while 3D shows both the left and right views. 2) Part of this study might be replicated with another display technology, as the current results are based only on parallax barrier technology. This note is also important to consider in the upcoming experiments comparing coding and/or transmission parameters.
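
To illustrate the column-wise interleaving described above (a sketch based on our reading of the plug-in's behaviour; the array shapes and names are ours, and the arrays stand in for decoded frames), the two 320x360 views can be merged into one 640x360 frame as follows:

import numpy as np

h, half_w = 360, 320
left = np.zeros((h, half_w, 3), dtype=np.uint8)       # left view, 320x360
right = np.full((h, half_w, 3), 255, dtype=np.uint8)  # right view, 320x360

# Even display columns come from the left view, odd columns from the right;
# the parallax barrier then directs the two column sets to different eyes.
frame = np.empty((h, half_w * 2, 3), dtype=np.uint8)  # 640x360 output
frame[:, 0::2] = left
frame[:, 1::2] = right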

Secondly, the 3D viewing task might be harder than a conventional viewing task. Recent studies by Häkkinen et al. (2008) and Shibata et al. (2009) have compared 2D and 3D with images and videos. They concluded that a troublesome feeling while watching and focusing difficulties were commonly associated with 3D viewing, especially on mobile devices. These factors might be present in our study too; qualitative analysis should complement the current results and confirm this assumption. In addition, informal user observations indicated that participants actively searched for optimal viewing conditions by moving the device or themselves. The observational data recorded during the experiments needs to be analysed in detail to confirm this.

Thirdly, the current initial conclusions of the parameter comparisons are limited to results averaged over contents. However, quality evaluations are typically content-dependent (Jumisko-Pyykkö 2008). Further steps of analysis need to provide the content-dependent analysis.


5 Acknowledgements

The authors would like to thank Cinovent (Upper Middle Rhine Valley: http://www.cinovent.de), Red Star Studio (Knight's Quest 4D: http://www.redstarstudio.co.uk) and Stereoscape (Virtual Visit to Suomenlinna: http://www.stereoscape.net) for their contribution as stereoscopic content providers. Their generosity in providing suitable 3D material for our experiments made this research possible. We would also like to thank Tuomo Nyyssönen from Aidem Inc. for building the mobile camera system.

References

Audacity audio editor. http://audacity.sourceforge.net/, WWW-page, accessed 01/2009.

Avisynth Video Post-production Script System. http://www.avisynth.org/, WWW-page, accessed 01/2009.

Baddeley, A.D. “Working Memory”. Oxford University Press, New York, NY, USA, 1998.

Beerends, J.G. & de Caluwe, F.E., “The influence of video quality on perceived audio quality and vice versa,” Journal of the Audio Engineering Society, 47 (5), 355-362 (1999).

Childers, T. L., Houston, M. J., Heckler, S. E. “Measurement of Individual Differences in Visual versus Verbal Information Processing”. The Journal of Consumer Research, Vol. 12, No. 2 (Sep. 1985), pp. 125-134, 1985.

Cohen, M., “Multimodal Integration - A Biological View,” In Proceedings of IJCAI'01, Seattle, WA, 2001.

DVB-H.org. http://www.dvb-h.org/. WWW-page, accessed 12/2008.

European Broadcasting Union (EBU), European Telecommunications Standards Institute. “Digital Video Broadcasting (DVB): DVB-H Implementation Guidelines”. ETSI TR 102 377 V1.2.1 (2005-11), 2005.

European Broadcasting Union (EBU). “Subjective listening tests on low-bitrate audio codecs”. Tech 3296, June 2003.

Faria, G., Henriksson, J., Stare, E., Talmola, P. “DVB-H: Digital Broadcast Services to Handheld Devices”. Proceedings of the IEEE, Vol. 94, No. 1, January 2006.

Gotchev, A., Smolic, A., Jumisko-Pyykkö, S., Strohmeier, D., Akar, G. B., Merkle, P., Daskalov, N. "Mobile 3D television: Development of core technological elements and user-centered evaluation methods toward an optimized system". Proceedings of IS&T/SPIE Conference on Electronic Imaging, Volume 7256, 3D Video Delivery for Mobile Devices (2009); DOI:10.1117/12.816728

Gulliver, S. & Ghinea, G. “Defining User Perception of Distributed Multimedia Quality”. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP). Volume 2, Issue 4 (November 2006), pp. 241-257, 2006

Hands, D. S., “A Basic Multimedia Quality Model,” IEEE Transactions on Multimedia, Vol.6, No.6, 806-816(2004).


Hart, S. G. & Staveland, L. E. “Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research”. Human Mental Workload (pp. 239-250). Amsterdam: North Holland Press, 1988.

Häkkinen, J., Kawai, T., Takatalo, J., Leisti, T., Radun, J., Hirsaho, A. & Nyman, G. (2008) Measuring stereoscopic image quality experience with interpretation based quality methodology. Proceedings of the IS&T/SPIE International Symposium on Electronic Imaging 2008: Image Quality and System Performance V, 27-31 January 2008, San Jose Convention Center, San Jose, California, USA, Vol. 6808, pp. 68081B-68081B-12. doi:10.1117/12.760935

ITU-T Recommendation P.911, “Subjective audiovisual quality assessment methods for multimedia applications”. International Telecommunication Union – Telecommunication Standardization Sector, 1998.

Jumisko-Pyykkö, S. "I would like to see the subtitles and the face or at least hear the voice: Effects of Picture ratio and Audio-video Bitrate Ratio on Perception of Quality in Mobile Television". Multimedia Tools and Applications (Journal), vol 36, No 1-2, 2008. DOI - 10.1007/s11042-006-0080-9

Jumisko-Pyykkö, S. & Häkkinen, J. “Evaluation of Subjective Video Quality on Mobile Devices”. Proceedings of the 13th annual ACM international conference on Multimedia 2005, Singapore 6.-12. November 2005. pp. 535-538. ISBN:1-59593-044-2

Jumisko-Pyykkö, S. & Hannuksela, M.M. "Does Context Matter in Quality Evaluation of Mobile Television?". In Proceedings of the 10th International Conference on Human Computer Interaction with Mobile Devices and Services (MobileHCI 2008).

Jumisko-Pyykkö, S. & Strohmeier, D. 2008, “Report on research methodologies for the experiments”, MOBILE3DTV.

Jumisko-Pyykkö, S. & Vainio, T. “Framing the Context of Use for Mobile HCI”. Review paper about mobile contexts of use, submitted to IJMHCI, April 2009.

Jumisko-Pyykkö, S., Kumar Malamal Vadakital, V., Hannuksela, M.M. "Acceptance Threshold: Bidimensional Research Method for User-Oriented Quality Evaluation Studies". International Journal of Digital Multimedia Broadcasting, 2008.

Jumisko-Pyykkö, S., Weitzel, M., Strohmeier, D. "Designing for User Experience: What to Expect from Mobile 3D TV and Video?". Accepted to the First International Conference on Designing Interactive User Experiences for TV and Video, October 22-24, 2008, Silicon Valley, California, USA.

Kennedy, R.S., Lane, N.E., Berbaum, K.S., Lilienthal, M.G. “Simulator Sickness Questionnaire: an Enhanced Method for Quantifying Simulator Sickness”. International Journal of Aviation Psychology, 1993.

Knoche, H. O. & McCarthy, J. D. “Good News for Mobile TV”. Wireless World Research Forum, 2005

Knoche, H. O. & Sasse, M. A. “The Sweet Spot: How People Trade off Size and Definition on Mobile Devices”. Proceeding of the 16th ACM international conference on Multimedia, pp. 21-30, 2008

Knoche, H., McCarthy, J., Sasse, A. “Can Small Be Beautiful? Assessing Image Resolution Requirements for Mobile TV”. Proceedings of the 13th annual ACM international conference on Multimedia, pp. 829- 838, 2005


Lawless, H. T. & Heymann, H. “Sensory Evaluation of Food: Principles and Practices”. Chapman & Hall/CRC, New York, NY, USA, 1998.

McCarthy, J., Sasse, A., Miras, D. “Sharp or Smooth? Comparing the effects of quantization vs. frame rate for streamed video”. Proceedings of the SIGCHI conference on human factors in computing systems. The Association for Computing Machinery, New York, US, pp. 535-542, 2004

McGurk, H. & MacDonald, J. “Hearing lips and seeing voices”, Nature 264, 746-748 (1976).

MeGUI video conversion graphical user interface frontend. http://sourceforge.net/projects/megui/. WWW-page, accessed 01/2009.

Meilgaard, M. C., Civille, G. V., Carr, B. T. “Sensory Evaluation Techniques”, CRC Press, New York, NY, USA, 1999.

Nero Digital AAC Codec. http://www.nero.com/, WWW-page, accessed 01/2009.

O'Hara, K., Mitchell, A. S., Vorbau, A. “Consuming Video on Mobile Devices”. Proceedings of the SIGCHI conference on Human factors in computing systems, pp. 857-866, 2007.

Oulasvirta, A., Tamminen, S., Roto, V., and Kuorelahti, J. (2005). Interaction in 4-second bursts: The fragmented nature of attentional resources in mobile HCI. Proceedings of CHI 2005, ACM Press, New York, pp. 919-928.

Reeves, B. & Nass, C. “The media equation: How people treat computers, television, and new media like real people and places”. Cambridge University Press. 1996.

Ries, M., Puglia, R., Tebaldi, T., Nemethova, O., Rupp, M. “Audiovisual Quality Estimations for Mobile Streaming Services”. Proc. of the 2nd International Symposium on Wireless Communication Systems, Siena, Italy, 2005.

Rogers, E.M. “Diffusion of Innovations”. Free Press, New York, NY, USA, 5th edition, 2003.

Shibata, T., Kurihara, S., Kawai, T., Takahashi, T., Shimizu, T., Kawada, R., Ito, A., Häkkinen, J., Takatalo, J. & Nyman, G. “Evaluation of stereoscopic image quality for mobile devices using interpretation based quality methodology”. Proc. SPIE 7237, 72371E (2009), DOI:10.1117/12.807080

Shimojo, S. & Shams, L. (2001), "Sensory modalities are not separate modalities: plasticity and interactions", Current opinion in neurobiology, 11.

Strohmeier, D., Jumisko-Pyykkö, S., Weitzel, M., Schneider, S. “Report on User Needs and Expectations for Mobile Stereo-video”. Tampere University of Technology, 2008.

Tikanmäki, A., Gotchev, A., Smolic, A., Müller, K. “Quality Assessment of 3D Video in Rate Allocation Experiments”. IEEE International Symposium on Consumer Electronics, 2008

VideoLAN VLC player. http://www.videolan.org/vlc/. WWW-page, accessed 01/2009.

Wijesekera, K., Srivastava, J., Nerode, A., Foresti, M. “Experimental evaluation of loss perception in continuous media”. Multimedia Systems, Volume 7, Issue 6 (November 1999), pp. 486-499, 1999

Winkler, S. & Faller, C. “Audiovisual quality evaluation of low-bitrate video”. Proc. SPIE/IS&T Human Vision and Electronic Imaging, vol. 5666, San Jose, United States of America, pp. 139-148 (2005).

Winkler, S. & Faller, C. “Perceived audiovisual quality of low-bitrate multimedia content”. IEEE transactions on multimedia, vol. 8, no. 5, October 2006, pp. 973-980, 2006


Wynekoop, J.L. & Russo, N.L. “Studying system development methodologies: an examination of research methods”, Information Systems Journal (7) 1997, pp. 47-65.

x264 Video Codec. http://www.videolan.org/developers/x264.html. WWW-page, accessed 01/2009.


APPENDIX 1

Example of questionnaire used in the expert evaluation of the scene characteristics

Rate each video according to the following parameters: spatial resolution (amount of detail), temporal resolution (amount of movement), depth perception (amount of 3D) and dynamism of depth information (how much it changes during the clip).

Also, evaluate how many scene cuts there are in the clip and how much textual information the clip contains. In addition, specify if the clip contains speech, music and/or sound effects.

Knight’s Quest
Spatial: low / med / high
Temporal: low / med / high
Depth: low / med / high
Depth dynamism: low / med / high
Scene cuts: none / some / much
Text: none / some / much
Sound: speech / music / effects

Liberation of Plzen
Spatial: low / med / high
Temporal: low / med / high
Depth: low / med / high
Depth dynamism: low / med / high
Scene cuts: none / some / much
Text: none / some / much
Sound: speech / music / effects

Rhine Valley
Spatial: low / med / high
Temporal: low / med / high
Depth: low / med / high
Depth dynamism: low / med / high
Scene cuts: none / some / much
Text: none / some / much
Sound: speech / music / effects


APPENDIX 2

ANALYSIS OF THE CONTEXTS OF USE IN THE EXPERIMENTS (based on Jumisko-Pyykkö & Vainio, submitted)

Physical
Spatial location, functional place and space – Lab: laboratory conditions (ITU-T P.911); Home: simulated/analogue home (Jumisko-Pyykkö et al. 2008); Station: cafe at the railway station (Jumisko-Pyykkö et al. 2008); Bus: local bus from suburb to centre and vice versa (Jumisko-Pyykkö et al. 2008)
Sensed environmental attributes – Lab: audio quiet; visual calm, crepuscular light, indoor. Home: audio quiet; visual calm, indoor. Station: audio noisy; visual noisy, indoor [DYN]. Bus: audio noisy; visual noisy, variable light conditions [DYN]
Movements and mobility – Lab: position straight; Home: no motion, position lean backward/forward [DYN]; Station: no motion, position lean forward [DYN]; Bus: pseudo-motion based on bus movements, position lean forward [DYN]
Artefacts (other than answer sheet) – Lab: *; Home: accessories (e.g. pillows); Station: bag, refreshments; Bus: bag

Temporal
Duration – Lab: ~30 min; Home: ~30 min, macrobreak (Chipcase et al. 2006); Station: ~25 min, macrobreak (Chipcase et al. 2006); Bus: ~25 min, macrobreak (Chipcase et al. 2006)
Time of day/week – varies in all contexts
Before-during-after – * in all contexts
Actions in relation to time – Lab: *; Home: extra time; Station: extra time, time pressure at the end of the task; Bus: extra time, time pressure at the end of the task

Task context
Multitasking: Task 1 – quality evaluation in all contexts
Multitasking: Task 2 – Lab: *; Home: relax; Station: checking the time of the arriving train (audio/visual task); Bus: watching for the upcoming bus stop (visual task)
Instructions for Task 2 given to the participants – Lab: *; Home: "Watch programs to entertain yourself during a quiet Sunday afternoon"; Station: "Watch programs while waiting for the train to (X location) in (Y minutes)"; Bus: "Watch programs while commuting to the train station to catch the train (or to the Hervanta suburb to meet a friend)"
Interruptions – Lab: *; Home: no interruptions; Station: physical and social [DYN]; Bus: physical and social [DYN]
Task type – Lab: *; Home, Station and Bus: entertainment

Social context
Persons present – Lab: experimenter; Home: participant alone; Station: bystanders, experimenter [DYN]; Bus: bystanders, other people near, experimenter [DYN]
Interpersonal actions – Lab: *; Home: *; Station: possible; Bus: possible
Culture – * in all contexts

Technical and informational context
Other systems and services – Lab: *; Home: *; Station: mobile usability lab; Bus: mobile usability lab
Interoperability – * in all contexts
Informational artifacts and access – * in all contexts
Mixed reality – * in all contexts

Properties
Level of magnitude – * in all contexts
Level of dynamism ([DYN] marked on components) – Lab: static; Home: static; Station: dynamic (physical, task, social); Bus: dynamic (physical, task, social)
Pattern (rhythmic - random) – * in all contexts
Typical combinations – * in all contexts

OTHER RELATED FACTORS
Motivations – Lab: *; Home: entertain, pass time, relax (Jumisko-Pyykkö et al. 2008); Station: entertain, pass time (Jumisko-Pyykkö et al. 2008); Bus: entertain, pass time (Jumisko-Pyykkö et al. 2008)
Viewing distance – Lab: fixed; Home, Station and Bus: freedom to adjust
Level of volume on devices – Lab: fixed; Home, Station and Bus: freedom to adjust


APPENDIX 3

Experiment 1; Detailed pair-wise comparisons of satisfaction ratings, averaged over the contents

Context Lab Home

Parameter combinations Satisfaction Satisfaction

Combination 1 Combination 2 Z p Z p

160kbps, 10fps 160kbps, 10fps

160kbps, 15fps 160kbps, 10fps -2,299 0,022 -1,604 0,109

160kbps, 15fps, 2D 160kbps, 10fps -9,022 0,000 -9,105 0,000

320kbps, 10fps 160kbps, 10fps -4,672 0,000 -2,405 0,016

320kbps, 15fps 160kbps, 10fps -3,620 0,000 -5,109 0,000

320kbps, 15fps, 2D 160kbps, 10fps -9,251 0,000 -9,424 0,000

768kbps, 10fps 160kbps, 10fps -4,881 0,000 -3,533 0,000

768kbps, 15fps 160kbps, 10fps -4,720 0,000 -5,733 0,000

1536kbps, 24fps 160kbps, 10fps -5,108 0,000 -4,893 0,000

160kbps, 10fps 160kbps, 15fps -2,299 0,022 -1,604 0,109

160kbps, 15fps 160kbps, 15fps

160kbps, 15fps, 2D 160kbps, 15fps -9,242 0,000 -9,224 0,000

320kbps, 10fps 160kbps, 15fps -5,827 0,000 -3,825 0,000

320kbps, 15fps 160kbps, 15fps -5,225 0,000 -5,922 0,000

320kbps, 15fps, 2D 160kbps, 15fps -9,339 0,000 -9,429 0,000

768kbps, 10fps 160kbps, 15fps -5,369 0,000 -4,486 0,000

768kbps, 15fps 160kbps, 15fps -5,641 0,000 -6,576 0,000

1536kbps, 24fps 160kbps, 15fps -6,097 0,000 -5,684 0,000

160kbps, 10fps 160kbps, 15fps, 2D -9,022 0,000 -9,105 0,000

160kbps, 15fps 160kbps, 15fps, 2D -9,242 0,000 -9,224 0,000

160kbps, 15fps, 2D 160kbps, 15fps, 2D

320kbps, 10fps 160kbps, 15fps, 2D -8,303 0,000 -8,763 0,000

320kbps, 15fps 160kbps, 15fps, 2D -8,744 0,000 -8,272 0,000

320kbps, 15fps, 2D 160kbps, 15fps, 2D -5,451 0,000 -5,913 0,000

768kbps, 10fps 160kbps, 15fps, 2D -8,293 0,000 -8,035 0,000

768kbps, 15fps 160kbps, 15fps, 2D -8,037 0,000 -7,472 0,000

1536kbps, 24fps 160kbps, 15fps, 2D -7,704 0,000 -8,163 0,000

160kbps, 10fps 320kbps, 10fps -4,672 0,000 -2,405 0,016

160kbps, 15fps 320kbps, 10fps -5,827 0,000 -3,825 0,000

160kbps, 15fps, 2D 320kbps, 10fps -8,303 0,000 -8,763 0,000

320kbps, 10fps 320kbps, 10fps

320kbps, 15fps 320kbps, 10fps -1,075 0,283 -2,262 0,024

320kbps, 15fps, 2D 320kbps, 10fps -8,969 0,000 -9,210 0,000

768kbps, 10fps 320kbps, 10fps -0,717 0,473 -1,811 0,070

768kbps, 15fps 320kbps, 10fps -0,118 0,906 -4,317 0,000

1536kbps, 24fps 320kbps, 10fps -1,277 0,202 -3,009 0,003

160kbps, 10fps 320kbps, 15fps -3,620 0,000 -5,109 0,000

160kbps, 15fps 320kbps, 15fps -5,225 0,000 -5,922 0,000

160kbps, 15fps, 2D 320kbps, 15fps -8,744 0,000 -8,272 0,000

320kbps, 10fps 320kbps, 15fps -1,075 0,283 -2,262 0,024

320kbps, 15fps 320kbps, 15fps

320kbps, 15fps, 2D 320kbps, 15fps -9,052 0,000 -8,979 0,000

768kbps, 10fps 320kbps, 15fps -1,425 0,154 -0,585 0,558

768kbps, 15fps 320kbps, 15fps -0,990 0,322 -2,022 0,043


1536kbps, 24fps 320kbps, 15fps -2,260 0,024 -0,533 0,594

160kbps, 10fps 320kbps, 15fps, 2D -9,251 0,000 -9,424 0,000

160kbps, 15fps 320kbps, 15fps, 2D -9,339 0,000 -9,429 0,000

160kbps, 15fps, 2D 320kbps, 15fps, 2D -5,451 0,000 -5,913 0,000

320kbps, 10fps 320kbps, 15fps, 2D -8,969 0,000 -9,210 0,000

320kbps, 15fps 320kbps, 15fps, 2D -9,052 0,000 -8,979 0,000

320kbps, 15fps, 2D 320kbps, 15fps, 2D

768kbps, 10fps 320kbps, 15fps, 2D -9,074 0,000 -8,862 0,000

768kbps, 15fps 320kbps, 15fps, 2D -8,902 0,000 -8,703 0,000

1536kbps, 24fps 320kbps, 15fps, 2D -8,618 0,000 -8,762 0,000

160kbps, 10fps 768kbps, 10fps -4,881 0,000 -3,533 0,000

160kbps, 15fps 768kbps, 10fps -5,369 0,000 -4,486 0,000

160kbps, 15fps, 2D 768kbps, 10fps -8,293 0,000 -8,035 0,000

320kbps, 10fps 768kbps, 10fps -0,717 0,473 -1,811 0,070

320kbps, 15fps 768kbps, 10fps -1,425 0,154 -0,585 0,558

320kbps, 15fps, 2D 768kbps, 10fps -9,074 0,000 -8,862 0,000

768kbps, 10fps 768kbps, 10fps

768kbps, 15fps 768kbps, 10fps -0,061 0,951 -2,663 0,008

1536kbps, 24fps 768kbps, 10fps -1,199 0,231 -0,596 0,551

160kbps, 10fps 768kbps, 15fps -4,720 0,000 -5,733 0,000

160kbps, 15fps 768kbps, 15fps -5,641 0,000 -6,576 0,000

160kbps, 15fps, 2D 768kbps, 15fps -8,037 0,000 -7,472 0,000

320kbps, 10fps 768kbps, 15fps -0,118 0,906 -4,317 0,000

320kbps, 15fps 768kbps, 15fps -0,990 0,322 -2,022 0,043

320kbps, 15fps, 2D 768kbps, 15fps -8,902 0,000 -8,703 0,000

768kbps, 10fps 768kbps, 15fps -0,061 0,951 -2,663 0,008

768kbps, 15fps 768kbps, 15fps

1536kbps, 24fps 768kbps, 15fps -0,872 0,383 -1,782 0,075

160kbps, 10fps 1536kbps, 24fps -5,108 0,000 -4,893 0,000

160kbps, 15fps 1536kbps, 24fps -6,097 0,000 -5,684 0,000

160kbps, 15fps, 2D 1536kbps, 24fps -7,704 0,000 -8,163 0,000

320kbps, 10fps 1536kbps, 24fps -1,277 0,202 -3,009 0,003

320kbps, 15fps 1536kbps, 24fps -2,260 0,024 -0,533 0,594

320kbps, 15fps, 2D 1536kbps, 24fps -8,618 0,000 -8,762 0,000

768kbps, 10fps 1536kbps, 24fps -1,199 0,231 -0,596 0,551

768kbps, 15fps 1536kbps, 24fps -0,872 0,383 -1,782 0,075

1536kbps, 24fps 1536kbps, 24fps

Video bitrate Satisfaction Satisfaction

Combination 1 Combination 2 Z p Z p

320kbps, 10fps 160kbps, 10fps -4,672 0,000 -2,405 0,016

768kbps, 10fps 160kbps, 10fps -4,881 0,000 -3,533 0,000

768kbps, 10fps 320kbps, 10fps -0,717 0,473 -1,811 0,070

320kbps, 15fps 160kbps, 15fps -5,225 0,000 -5,922 0,000

768kbps, 15fps 160kbps, 15fps -5,641 0,000 -6,576 0,000

768kbps, 15fps 320kbps, 15fps -0,990 0,322 -2,022 0,043

Frames per second Satisfaction Satisfaction

Combination 1 Combination 2 Z p Z p

160kbps, 15fps 160kbps, 10fps -2,299 0,022 -1,604 0,109

320kbps, 15fps 320kbps, 10fps -1,075 0,283 -2,262 0,024

768kbps, 15fps 768kbps, 10fps -0,061 0,951 -2,663 0,008


Presentation mode (2D/3D) Satisfaction Satisfaction

Combination 1 Combination 2 Z p Z p

160kbps, 15fps, 2D 160kbps, 15fps -9,242 0,000 -9,224 0,000

320kbps, 15fps, 2D 320kbps, 15fps -9,052 0,000 -8,979 0,000

2D 3D -12,919 0,000 -12,870 0,000

Context Satisfaction Acceptance

Combination 1 Combination 2 Z p χ2 p

home, 160kbps, 10fps lab, 160kbps, 10fps -2,082 0,037 0,138 0,710

home, 160kbps, 15fps lab, 160kbps, 15fps -1,209 0,227 0,523

home, 160kbps, 15fps, 2D lab, 160kbps, 15fps, 2D -2,670 0,008 0,180

home, 320kbps, 10fps lab, 320kbps, 10fps -3,297 0,001 7,225 0,007

home, 320kbps, 15fps lab, 320kbps, 15fps -0,346 0,729 0,390 0,532

home, 320kbps, 15fps, 2D lab, 320kbps, 15fps, 2D -0,266 0,790 0,012

home, 768kbps, 10fps lab, 768kbps, 10fps -1,863 0,062 0,108 0,742

home, 768kbps, 15fps lab, 768kbps, 15fps -0,254 0,799 0,516 0,472

home, 1536kbps, 24fps lab, 1536kbps, 24fps -2,256 0,024 0,225 0,635

home lab -4,729 0,000 7,985 0,005


APPENDIX 4

Experiment 2; Detailed pair-wise comparisons of satisfaction ratings, averaged over the contents

Context Lab Bus Station

Parameter combinations Satisfaction Satisfaction Satisfaction

Combination 1 Combination 2 Z p Z p Z p

320kbps, 18kbps 320kbps, 18kbps

320kbps, 18kbps, 2D 320kbps, 18kbps -8,506 0,000 -5,280 0,000 -8,294 0,000

320kbps, 48kbps 320kbps, 18kbps -0,882 0,378 -0,530 0,596 -0,433 0,665

320kbps, 48kbps, 2D 320kbps, 18kbps -8,480 0,000 -4,896 0,000 -7,618 0,000

768kbps, 18kbps 320kbps, 18kbps -2,015 0,044 -0,582 0,561 -1,855 0,064

768kbps, 48kbps 320kbps, 18kbps -1,642 0,101 -1,477 0,140 -2,817 0,005

320kbps, 18kbps 320kbps, 18kbps, 2D -8,506 0,000 -5,280 0,000 -8,294 0,000

320kbps, 18kbps, 2D 320kbps, 18kbps, 2D

320kbps, 48kbps 320kbps, 18kbps, 2D -8,083 0,000 -5,446 0,000 -8,030 0,000

320kbps, 48kbps, 2D 320kbps, 18kbps, 2D -2,174 0,030 -0,494 0,621 -2,170 0,030

768kbps, 18kbps 320kbps, 18kbps, 2D -7,878 0,000 -5,405 0,000 -8,039 0,000

768kbps, 48kbps 320kbps, 18kbps, 2D -7,896 0,000 -5,320 0,000 -7,594 0,000

320kbps, 18kbps 320kbps, 48kbps -0,882 0,378 -0,530 0,596 -0,433 0,665

320kbps, 18kbps, 2D 320kbps, 48kbps -8,083 0,000 -5,446 0,000 -8,030 0,000

320kbps, 48kbps 320kbps, 48kbps

320kbps, 48kbps, 2D 320kbps, 48kbps -8,058 0,000 -4,873 0,000 -7,401 0,000

768kbps, 18kbps 320kbps, 48kbps -1,184 0,236 -0,611 0,541 -1,268 0,205

768kbps, 48kbps 320kbps, 48kbps -1,500 0,134 -1,047 0,295 -2,128 0,033

320kbps, 18kbps 320kbps, 48kbps, 2D -8,480 0,000 -4,896 0,000 -7,618 0,000

320kbps, 18kbps, 2D 320kbps, 48kbps, 2D -2,174 0,030 -0,494 0,621 -2,170 0,030

320kbps, 48kbps 320kbps, 48kbps, 2D -8,058 0,000 -4,873 0,000 -7,401 0,000

320kbps, 48kbps, 2D 320kbps, 48kbps, 2D

768kbps, 18kbps 320kbps, 48kbps, 2D -8,126 0,000 -4,851 0,000 -7,129 0,000

768kbps, 48kbps 320kbps, 48kbps, 2D -7,975 0,000 -4,656 0,000 -6,661 0,000

320kbps, 18kbps 768kbps, 18kbps -2,015 0,044 -0,582 0,561 -1,855 0,064

320kbps, 18kbps, 2D 768kbps, 18kbps -7,878 0,000 -5,405 0,000 -8,039 0,000

320kbps, 48kbps 768kbps, 18kbps -1,184 0,236 -0,611 0,541 -1,268 0,205

320kbps, 48kbps, 2D 768kbps, 18kbps -8,126 0,000 -4,851 0,000 -7,129 0,000

768kbps, 18kbps 768kbps, 18kbps

768kbps, 48kbps 768kbps, 18kbps -0,053 0,957 -0,142 0,887 -0,649 0,517

320kbps, 18kbps 768kbps, 48kbps -1,642 0,101 -1,477 0,140 -2,817 0,005

320kbps, 18kbps, 2D 768kbps, 48kbps -7,896 0,000 -5,320 0,000 -7,594 0,000

320kbps, 48kbps 768kbps, 48kbps -1,500 0,134 -1,047 0,295 -2,128 0,033

320kbps, 48kbps, 2D 768kbps, 48kbps -7,975 0,000 -4,656 0,000 -6,661 0,000

768kbps, 18kbps 768kbps, 48kbps -0,053 0,957 -0,142 0,887 -0,649 0,517

768kbps, 48kbps 768kbps, 48kbps

Video bitrate Satisfaction Satisfaction Satisfaction

Combination 1 Combination 2 Z p Z p Z p

768kbps, 18kbps 320kbps, 18kbps -2,015 0,044 -0,582 0,561 -1,855 0,064

768kbps, 48kbps 320kbps, 48kbps -1,500 0,134 -1,047 0,295 -2,128 0,033

Audio bitrate Satisfaction Satisfaction Satisfaction

Combination 1 Combination 2 Z p Z p Z p

320kbps, 48kbps 320kbps, 18kbps -0,882 0,378 -0,530 0,596 -0,433 0,665

320kbps, 48kbps, 2D 320kbps, 18kbps, 2D -2,174 0,030 -0,494 0,621 -2,170 0,030


768kbps, 48kbps 768kbps, 18kbps -0,053 0,957 -0,142 0,887 -0,649 0,517

Presentation mode (2D/3D) Satisfaction Satisfaction Satisfaction

Combination 1 Combination 2 Z p Z p Z p

320kbps, 18kbps, 2D 320kbps, 18kbps -8,506 0,000 -5,280 0,000 -8,294 0,000

320kbps, 48kbps, 2D 320kbps, 48kbps -8,058 0,000 -4,873 0,000 -7,401 0,000

2D 3D -11,709 0,000 -7,186 0,000 -11,107 0,000

Context Satisfaction Acceptance

Combination 1 Combination 2 Z p χ2 p

bus, 320kbps, 18kbps lab, 320kbps, 18kbps -0,354 0,723 0,118

bus, 320kbps, 18kbps, 2D lab, 320kbps, 18kbps, 2D -1,988 0,047 1,000

bus, 320kbps, 48kbps lab, 320kbps, 48kbps -0,041 0,968 1,000

bus, 320kbps, 48kbps, 2D lab, 320kbps, 48kbps, 2D -3,571 0,000 1,000

bus, 768kbps, 18kbps lab, 768kbps, 18kbps -0,111 0,912 0,824

bus, 768kbps, 48kbps lab, 768kbps, 48kbps 0,000 1,000 0,607

station, 320kbps, 18kbps lab, 320kbps, 18kbps -1,231 0,218 3,841 0,050

station, 320kbps, 18kbps, 2D lab, 320kbps, 18kbps, 2D -0,516 0,606 1,000

station, 320kbps, 48kbps lab, 320kbps, 48kbps -1,681 0,093 2,700 0,100

station, 320kbps, 48kbps, 2D lab, 320kbps, 48kbps, 2D -3,377 0,001 0,754

station, 768kbps, 18kbps lab, 768kbps, 18kbps -1,555 0,120 2,857 0,091

station, 768kbps, 48kbps lab, 768kbps, 48kbps -1,727 0,084 4,033 0,045

station, 320kbps, 18kbps bus, 320kbps, 18kbps -0,414 0,679 0,804

station, 320kbps, 18kbps, 2D bus, 320kbps, 18kbps, 2D -1,467 0,142 0,500

station, 320kbps, 48kbps bus, 320kbps, 48kbps -0,899 0,369 0,096

station, 320kbps, 48kbps, 2D bus, 320kbps, 48kbps, 2D -0,135 0,892 1,000

station, 768kbps, 18kbps bus, 768kbps, 18kbps -0,681 0,496 0,307

station, 768kbps, 48kbps bus, 768kbps, 48kbps -1,293 0,196 0,180

bus lab -1,984 0,047 1,095 0,295

station lab -1,874 0,061 14,160 0,000

station bus -1,485 0,138 2,813 0,094


APPENDIX 5

The following is an example Avisynth script used to encode the clips for the experiments. Text following the '#' character is a comment.

start = 927                          #frame number of beginning of clip
end = 1848                           #frame number of ending of clip

DirectShowSource("__vid__")          #load video clip
KillAudio()                          #remove audio (processed separately)
Trim(start,end)                      #cut clip according to start and end frames
Crop(0,0,width/2,height)             #crop left video channel
vid_l = Spline16Resize(320,360,0,0)  #squeeze horizontally to 320x360

DirectShowSource("__vid__")          #re-load video clip
KillAudio()                          #remove audio (processed separately)
Trim(start,end)                      #cut clip according to start and end frames
Crop(width/2,0,-0,-0)                #crop right video channel
vid_r = Spline16Resize(320,360,0,0)  #squeeze horizontally to 320x360

StackHorizontal(vid_l,vid_r)         #place the channels side-by-side (640x360 pixels)
Addborders(0,60,0,60,$000000)        #add black borders above and below video (640x480 pixels)
video = ChangeFPS(15)                #change fps to the desired value by deleting frames
video                                #return produced video for frameserving

MeGUI served the video stream from the Avisynth script to the x264 encoder using the following syntax (here with the 768 kbps average video bitrate):

program --bitrate 768 --no-cabac --partitions p8x8,b8x8,i4x4 --threads auto --thread-input --progress --no-psnr --no-ssim --output "output" "input"


Mobile 3DTV Content Delivery Optimization over DVB-H System

MOBILE3DTV - Mobile 3DTV Content Delivery Optimization over DVB-H System - is a three-year project which started in January 2008. The project is partly funded by the European Union 7th RTD Framework Programme in the context of the Information & Communication Technology (ICT) Cooperation Theme.

The main objective of MOBILE3DTV is to demonstrate the viability of the new technology of mobile 3DTV. The project develops a technology demonstration system for the creation and coding of 3D video content, its delivery over DVB-H and display on a mobile device equipped with an auto-stereoscopic display.

The MOBILE3DTV consortium is formed by three universities, a public research institute and two SMEs from Finland, Germany, Turkey, and Bulgaria. Partners span diverse yet complementary expertise in the areas of 3D content creation and coding, error resilient transmission, user studies, visual quality enhancement and project management.

For further information about the project, please visit www.mobile3dtv.eu.

Tuotekehitys Oy Tamlink - Project coordinator - FINLAND
Tampereen Teknillinen Yliopisto - Visual quality enhancement, scientific coordinator - FINLAND
Fraunhofer Gesellschaft zur Förderung der angewandten Forschung e.V. - Stereo video content creation and coding - GERMANY
Middle East Technical University - Error resilient transmission - TURKEY
Technische Universität Ilmenau - Design and execution of subjective tests - GERMANY
MM Solutions Ltd. - Design of prototype terminal device - BULGARIA

MOBILE3DTV project has received funding from the European Community’s ICT programme in the context of the Seventh Framework Programme (FP7/2007-2011) under grant agreement n° 216503. This document reflects only the authors’ views and the Community or other project partners are not liable for any use that may be made of the information contained therein.