VRIF Guidelines 1.0 NAB 2018 Master Class · 2018. 4. 20. · Viewport-Independent (VPI)...
VRIF Guidelines 1.0
NAB 2018 Master Class
01/08/2018 Virtual Reality Industry Forum 1
Mauricio Aracena, Ericsson
Secretary & Board Member VR Industry Forum
Agenda
• Guidelines Overview and organization
– Production
– Distribution
– Security
• Architecture and Interoperability points
Generally
Provide implementation guidance that covers the technical aspects and human factors pertaining to content production, distribution and security
Organization
■ Guidelines Organized by Vertical
■ Currently, one Vertical:
Download or Streaming of VR360 Content
■ Per Vertical:
■ Use Case(s)
■ Technical Enablers (specifications)
■ Guidelines for Service Providers
■ Guidelines for Platform Developers
■ Guidelines for App Developers
■ Next Vertical: Live content
Content Production
■ Content production aspects relating to the reduction of physio-cognitive impacts for a comfortable viewing experience
■ Maximize enjoyment, minimize side-effects
■ Camera position, motion and alignment, Scene transitions, Limits to object positioning and speed
■ Technical aspects of VR Media formats
■ Resolutions, Frame Rates, Color depths
■ Content exchange metadata
■ 360 video master file format containing descriptive metadata
■ Spatial audio definitions for objects and scenes, described according to Audio Definition Model (ITU-R BS.2076-1)
Content Format
Aspect Ratio: 2:1 Equirectangular (for full 360 video); Framed VR: use metadata
Video Resolution: Minimum 4k x 2k per eye, frame stacked for stereoscopic
Frame Rate:
Monoscopic: 25/30, 50/60
Stereoscopic (higher rates for better image fusion): 50/60, 75, 90, 100/120
Video Format: Bit Depth: 10; Color Subsample: 4:2:2; Color Space: BT.709; Scan: Progressive
Audio Format: Audio Definition Model (ADM); PCM – 10/32 bit floating point; Sample Rate – 48k (min)
Delivery Formats:
Opt 1: MXF Program Contribution (AMWA AS-11 X1 as per DPP specification)
Opt 2: IMF Application 2e / JPEG2000, minimum data rate 150 Mb/s
Opt 3: DNxHR HQ
Opt 4: ProRes 422 HQ
Content Distribution
■ Based on “OTT Download and Streaming” cases
■ Guidance and recommendations to implement VR video and audio profiles from MPEG OMAF
■ 3D Audio media profile
■ Viewport Independent media profile
■ Viewport Dependent media profile
■ Configuration of packing, projection and supporting metadata
■ Use of Adaptation Sets for MPEG DASH based streaming
OMAF Audio profile
- Media Profile: 3D Audio Baseline
- Codec: MPEG-H Audio, Low Complexity profile, Level 1, 2 or 3
- Max Sampling Rate: 48 kHz
- ISO BMFF Media File format
- Support: channels, objects and Higher-Order Ambisonics (HOA)
[Figure: Binaural VR, Fraunhofer IIS]
VR Technology based on MPEG OMAF
• Viewport Independent (VPI)
– Deliver the entire 360 video to the VR headset
– Delivered as traditional video
– Not cost effective
• High resolution and high frame rates require very high bitrates for a high-quality VR experience
• User normally sees 12-14% of the image
• Viewport Dependent + extra (VPD)
– Deliver only a portion (viewport + extra) of the 360 video
– Reduced bandwidth requirements
– Feedback to server (uplink connection is required)
– Latency is important
[Figure: full 360 image with the viewport covering 12% - 14% (VPI), and with the viewport + extra region marked (VPD)]
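The 12-14% figure can be sanity-checked with a rough calculation. The sketch below is only an approximation under stated assumptions: the field-of-view values are illustrative and not taken from the slides, and the viewport is treated as a plain rectangle in an equirectangular frame spanning 360 x 180 degrees, centered on the equator.

```python
def erp_viewport_fraction(hfov_deg: float, vfov_deg: float) -> float:
    """Approximate share of an equirectangular frame covered by a viewport.

    Assumes the viewport is centered on the equator and can be treated as a
    rectangle in the ERP image, which spans 360 x 180 degrees.
    """
    if not (0 < hfov_deg <= 360 and 0 < vfov_deg <= 180):
        raise ValueError("field of view out of range")
    return (hfov_deg / 360.0) * (vfov_deg / 180.0)


# Illustrative headset fields of view (assumed values, not from the slides).
for hfov, vfov in [(90, 90), (100, 90)]:
    share = erp_viewport_fraction(hfov, vfov)
    print(f"{hfov} x {vfov} degrees: {share:.1%} of the frame")
```

With these assumed fields of view the share comes out at 12.5% and 13.9%, which is in line with the range quoted on the slide.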
OMAF Video Profiles
Device req. (both profiles): HEVC Main10 Profile, Main Tier, Level 5.1 (e.g. 4Kx2K@60fps), single decoder instance
Viewport-Independent (VPI)
File Format:
- ISO BMFF, DASH
Projection Maps:
- Equirectangular (ERP)
Description:
- Simple
- Viewport-agnostic delivery and decoding
- Whole sphere encoded with single bitstream
Viewport quality:
- Lower quality at viewport
- A small portion of level 5.1 decoding is displayed
Additional requirements:
- Minimal file format and DASH-level extensions
Viewport-Dependent (VPD)
File Format:
- ISO BMFF, DASH, Extractors
Projection Maps:
- Equirectangular (ERP) or Cubemap (CMP)
Description:
- Viewport dependent delivery
- Uses tiles to deliver streams with different resolution and quality
- Region Wise Packing (RWP) to create a projected frame from a decoded frame
Viewport quality:
- Higher resolution in the viewport
- Reduced bitrate
Additional requirements:
- Support for ISO BMFF extractors for tile selection on viewport changes
- Motion-to-high resolution latency constrained
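Tile selection on viewport changes can be illustrated with a small sketch. It is not the OMAF or DASH client algorithm: the function name, the uniform grid of tile columns and the "high"/"low" labels are all hypothetical, and only the horizontal direction of an equirectangular frame is modelled.

```python
import math


def tiles_in_viewport(yaw_deg: float, hfov_deg: float, num_columns: int) -> list[int]:
    """Return the tile columns of an ERP frame that overlap the viewport.

    Column 0 starts at longitude -180 degrees and the columns wrap around.
    """
    if hfov_deg >= 360:
        return list(range(num_columns))
    tile_width = 360.0 / num_columns
    left = yaw_deg - hfov_deg / 2.0 + 180.0    # shift so longitudes start at 0
    right = yaw_deg + hfov_deg / 2.0 + 180.0
    first = math.floor(left / tile_width)
    last = math.ceil(right / tile_width) - 1   # a tile that is only touched is not included
    return [column % num_columns for column in range(first, last + 1)]


def choose_qualities(yaw_deg: float, hfov_deg: float, num_columns: int) -> dict[int, str]:
    """Request high resolution inside the viewport and low resolution elsewhere."""
    visible = set(tiles_in_viewport(yaw_deg, hfov_deg, num_columns))
    return {c: ("high" if c in visible else "low") for c in range(num_columns)}


print(choose_qualities(yaw_deg=180, hfov_deg=90, num_columns=8))
```

When the viewer turns, the set of "high" columns changes, and the time until the new tiles arrive and are decoded is what the motion-to-high-resolution latency constraint refers to.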
VP Independent vs Dependent
[Figure: packed frame (decoded) next to projected CMP frame (rendered), for VPI and VPD]
VPI: Projected frame = packed frame = 4K
VPD: Projected frame (6K) > packed frame (4K)
Packed frame is within the limits of level 5.1 decoder (max 4K@60fps)
4K = 4096 H × 2160 V
6K = 6144 H × 3072 V
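The level 5.1 claim can be checked with simple arithmetic. The sketch below is a simplified check only: it uses the HEVC Level 5.1 limits for luma picture size and luma sample rate and ignores the other level constraints (bitrate, picture dimensions, buffer sizes). The function name is hypothetical.

```python
# HEVC Level 5.1 limits (luma picture size and luma sample rate).
MAX_LUMA_PICTURE_SAMPLES = 8_912_896   # MaxLumaPs
MAX_LUMA_SAMPLE_RATE = 534_773_760     # MaxLumaSr, samples per second


def fits_level_5_1(width: int, height: int, fps: float) -> bool:
    """Simplified check covering picture size and sample rate only."""
    samples = width * height
    return (samples <= MAX_LUMA_PICTURE_SAMPLES
            and samples * fps <= MAX_LUMA_SAMPLE_RATE)


print(fits_level_5_1(4096, 2160, 60))  # 4K packed frame
print(fits_level_5_1(6144, 3072, 60))  # 6K projected frame
```

Under these limits the 4K packed frame at 60 fps fits, while a 6K frame would not, which is why the VPD profile decodes a 4K packed frame and reconstructs the 6K projected frame afterwards.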
Viewport dependent (VPD)
[Figure: packed frame (decoded), 4K, and projected frame (rendered), 6K; region sizes shown: 768x2304 and 3078x2304]
Red regions are upscaled using OMAF metadata (rwpk)
Blue regions have the same resolution as in a packed frame
Packed frame is within the limits of level 5.1 decoder (max 4K@60fps)
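The relation between a packed region and its place in the projected frame can be sketched as a scale-and-shift mapping. This is an illustration under assumptions, not the OMAF region-wise packing syntax: the class and field names are hypothetical, the example rectangles are made up, and the real metadata carries more than the scaling and translation modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class RegionMapping:
    packed: Rect      # where the region sits in the decoded (packed) frame
    projected: Rect   # where the region belongs in the rendered (projected) frame


def packed_to_projected(m: RegionMapping, x: int, y: int) -> tuple[float, float]:
    """Map a sample position from the packed frame into the projected frame.

    Only scaling and translation are modelled in this sketch.
    """
    p, q = m.packed, m.projected
    if not (p.left <= x < p.left + p.width and p.top <= y < p.top + p.height):
        raise ValueError("position lies outside the packed region")
    scale_x = q.width / p.width
    scale_y = q.height / p.height
    return (q.left + (x - p.left) * scale_x, q.top + (y - p.top) * scale_y)


# Hypothetical region stored at half resolution and upscaled by 2 when rendered.
low_res = RegionMapping(packed=Rect(3072, 0, 768, 768),
                        projected=Rect(3072, 0, 1536, 1536))
print(packed_to_projected(low_res, 3172, 50))
```

A region whose packed and projected rectangles have the same size corresponds to the "same resolution" case on the slide, since its scale factors are 1.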
VPD tiling rearranging illustration
VPD cubemap
VPI vs VPD deployment
[Figure: three deployment options. In each, a storage & play-out server holds the original video at 8K (or 12K, 22K) and a delivery network carries it to a device whose viewport shows 12% - 14% of the full 360 image. Further label in the figure: "Reducing quality @ playout".]
Decoding original video: 8K (no optimization)
Stored 8K, delivered 8K, decoded 8K
- High bitrate at NW edge
- High decoding performance and low device reach
+ High visual quality
Decoding 4K (Viewport Independent)
Stored 8K, delivered 4K (360 video), decoded 4K
+ Low bitrate at NW edge
+ Lower decoding performance and better device reach
- Visual quality reduced
Decoding 4K (Viewport Dependent)
Stored 8K, delivered viewport @ 4K, decoded 4K
Viewport selection feedback @ low delay
+ Low bitrate at NW edge
+ Lower decoding performance and better device reach
+ High visual quality
Encryption: Viewport Independent
• The Viewport Independent Media Profile is compatible with commonly deployed DRM functionalities and encryption workflows (under verification in VRIF)
• The rendering process may differ across players and platforms.
– The DASH-IF interoperability guidelines [DASH-IF IOP], clause 7, provide a good overview of widely deployed DRM and encryption systems
Example of rendering transformation on protected video memory, from the Android online documentation: https://source.android.com/devices/graphics/arch-st
Encryption: Viewport dependent
• New requirements on content encryption
• The DASH Access engine in the device performs DASH sub-segment concatenation and constructs a single ISOBMFF file.
• This file contains the encrypted data from the individual DASH streams for each tile that makes up the frame.
• This restricts the AES encryption mode that can be used, i.e. ctr and cbc1 cannot be used.
• Recommended encryption mode is cbcs.
• Note this area remains a work in progress and further investigation is required to verify the
operation of VR Players and to analyse the performance implications of this approach.
Architecture and interoperability points