Direct3D and the Future of Graphics APIs Oldcorn Baker Andersson

  • 8/12/2019 Direct3D and the Future of Graphics APIs Oldcorn Baker Andersson

    1/39

    DIRECT3D AND THE FUTURE OF GRAPHICS APIS

    Dave Oldcorn, AMD

    Dan Baker, Oxide Games

    Johan Andersson, EA / DICE

    2 | AMD Direct3D Futures | March 20th, 2014

    NITROUS AND DX12

    Dan Baker

    Partner, Oxide Games

    HAVEN'T WE BEEN HERE BEFORE?

    Goal of DX9

    Remember State blocks?

    Goal of DX10

    Large state groups

    Goal of DX11

    Deferred contexts

    Are we actually getting faster, or are CPUs just faster?

    Quite possible no perf improvements due to API features in

    Maybe adding features isn't the answer

    DEEPLY ROOTED PROBLEM

    Coding design philosophies clash with real world

    OOP, data hiding, polymorphic design clashes with task-driven, data parallel

    Evident in language trends, striking disconnect between what is considered good code,

    Gap has always been there, but has grown in recent years

    15 years ago, processors often bound by computation

    Now, usually bound by cache misses, serialization, pipeline stalls, etc.

    Multi-Core CPUs are ineffectively utilized

    Heavy Iron, e.g. Big Object, Opaque memory, is a dead end for performance

    The revolt is beginning in high performance graphics APIs, but will spread

    BUT HOW MUCH FASTER?

    Biggest problem with industry today: Acceptance

    Only 1 secret in API design: That it can be done.

    And isn't that hard

    And our code isn't that ugly

    Star Swarm already demonstrating what is possible on a PC

    D3D12 FEATURES THAT NITROUS USES

    True de-coupled multi-core rendering

    Expecting near linear thread scheduling

    Manual Hazard tracking

    Hazards have been resolved already

    Memory Heaps

    Bigger chunks of memory pool grouping make management simp

    Descriptor Tables

    Table exposure allows a cheaper way of binding textures

    Allows texture bindings to be shared between non-adjacent batch

    WHAT'S DIFFERENT NOW?

    Spec Written

    Spec Reviewed

    API implemented

    Released to public

    First Engine use

    Anado

    Thenn

    WHAT'S DIFFERENT NOW?

    Create Spec

    Implement Spec

    Prototype on Actual Engines

    Analyze

    Discuss with IHVs, ISVs

    Start Here

    If ready, exit here to prep for release

    IN THE SPIRIT OF CONTRIBUTING

    Oxide proud to announce that we have a prototype of Nitrous running on D3D12

    *PR DISCLAIMER* This is not an official announcement regarding D3D12 support

    Porting from other modern APIs is much simpler than porting from D3D11 to D3D12

    EXPECTED RESULTS

    CPU Driver overhead largely put to rest

    Huge increases in driver reliability

    Huge decreases in frame latency, expecting median frame lat

    1.5 frames

    Increased perceptual responsiveness

    Never a dropped frame or stall due to driver API issues

    *Other OS events could cause stalls

    Driver should be far smaller, simpler to implement, IHVs can s

    time on optimizations

    DIRECT3D12 AND THE FUTURE OF GRAPHICS APIS

    Dave Oldcorn, Direct3D12 Driver Architect, AMD

    THE PROBLEM

    Mismatch between existing Direct3D and hardware capabilities

    Lots of CPU cores, but only one stream of data

    State communication in small chunks

    Hidden work

    Hard to predict from any one given call what the overhead might be

    Implicit memory management

    Hardware evolving away from classical register programming

    WHAT ARE THE CONSEQUENCES?

    WHAT ARE THE SOLUTIONS?

    SEQUENTIAL API

    Sequential API: state for given draw com

    previous time

    Some states must be reconciled on the validation)

    All contributing state needs to be vis

    GPU isn't like this, uses command buffe

    Must save and restore state at start

    [Diagram] API input: ... Draw, Set PS CB, Draw x 5, Set VS CB, Draw x 3, Set Blend, Set PS, Set RT state, Draw, Set VS VB, Draw ...

    [Diagram] State contributing to draw: (more, earlier), PS CB, VS CB, Blend state, PS, RT state
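The point of this slide can be sketched in a few lines of Python. This is a toy model, not Direct3D: the tuple-based command stream, the slot names and `resolve_draw_state` are all hypothetical, chosen only to mirror the diagram. It shows that in a sequential API the state for any one draw is whatever earlier calls left behind, so every contributing slot has to be visible when the draw is processed.

```python
def resolve_draw_state(commands):
    """Replay a sequential command stream and return, for every draw,
    a snapshot of all state in effect when that draw was issued."""
    current = {}      # slot name -> last value set (hypothetical model)
    snapshots = []
    for op, *args in commands:
        if op == "set":
            slot, value = args
            current[slot] = value
        elif op == "draw":
            # The draw depends on every slot set so far, however long ago.
            snapshots.append(dict(current))
        else:
            raise ValueError(f"unknown op: {op}")
    return snapshots


# A stream shaped like the diagram's "API input" column (values are made up).
stream = [
    ("set", "PS CB", "cb0"),
    ("draw",),
    ("set", "VS CB", "cb1"),
    ("draw",),
    ("set", "Blend", "alpha"),
    ("set", "PS", "ps_main"),
    ("set", "RT state", "rt0"),
    ("draw",),
]
states = resolve_draw_state(stream)
```

The third draw ends up depending on five slots, two of which were set before earlier draws. Starting a replay in the middle of the stream would miss them, which is one way to read the slide's remark about saving and restoring state.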

    THREADING A SEQUENTIAL API

    Sequential API threading

    Simple producer / consumer mo

    Extra latency

    Buffering has a cost

    More threading would mean dividin

    Bottlenecked on application or d

    Difficult to extract parallelism (Amd

    [Diagram: on the Application side, Application simulation, Prebuild Thread 0, Prebuild Thread 1 and the Application Render Thread; on the Runtime / Driver side, the Driver Thread; the GPU Execution Queue holds Queued Buffer 0, Queued Buffer 1, Queued Buffer 2, ...]

    COMMAND BUFFER API

    GPUs only listen to command bu

    Let the app build them

    Command Lists, at the API le

    Solves sequential API CPU issue

    [Diagram: on the Application side, Application simulation feeds Thread 0 and Thread 1, each of which builds a command buffer; on the Runtime / Driver side, the GPU Execution Queue holds Queued Buffer 0, Queued Buffer 1, ...]
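A minimal sketch of the idea on this slide, again as a Python model and not the D3D12 API: each worker thread records its own command list without touching shared state, and the lists are then queued for the GPU in a fixed order. `build_command_list`, `render_frame` and the list-of-tuples representation are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def build_command_list(draws):
    """Record one command list. It reads only its own input,
    so any thread can run it without locking."""
    return [("draw", d) for d in draws]


def render_frame(chunks, workers=2):
    """Build one command list per chunk in parallel, then submit in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so submission order
        # is deterministic even though the building is concurrent.
        command_lists = list(pool.map(build_command_list, chunks))
    gpu_queue = []
    for command_list in command_lists:
        gpu_queue.extend(command_list)
    return gpu_queue
```

For example, `render_frame([["terrain", "trees"], ["units"]])` queues the terrain and tree draws ahead of the unit draws regardless of which thread finishes first.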

    BETTER SCHEDULING

    App has much more control over scheduling work

    Both CPU side and GPU

    Threads don't really share much resource

    Many more options for streaming assets

    Driver thread

    Create thread

    D3D11: CB building threads te

    GPU load still added but o

    Render work

    Create work

    GPU executes

    D3D12: CB building threads mo

    Create thread

    Build threads

    PIPELINE OBJECTS

    Pipeline objects get rid of JIT and enable LTCG for GPUs

    Decouple interface and implementation

    We're aware that this is a hairpin bend for many graphics engines to negotiate.

    Many engines don't think in terms of predicting state up front

    The benefits are worth it

    [Diagram labels: V, PS, Index Process, Primitive Generation, Rast, Rendertarget Output, ?, ?]
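One way to picture "predicting state up front" is a cache of pipeline objects that are built once, at create time, and only looked up at draw time. The sketch below is a Python model under that assumption; `PipelineCache`, its key fields and the `compiles` counter are hypothetical and stand in for whatever compile and link work a real driver does.

```python
class PipelineCache:
    """Creates each distinct pipeline once, ahead of drawing."""

    def __init__(self):
        self._objects = {}
        self.compiles = 0   # counts the expensive create-time work

    def create(self, vs, ps, blend, rt_format):
        key = (vs, ps, blend, rt_format)
        if key not in self._objects:
            # Stand-in for compiling and linking the whole pipeline together.
            self.compiles += 1
            self._objects[key] = {"id": len(self._objects), "key": key}
        return self._objects[key]


def draw(bound_pipeline, mesh):
    """Draw time only references an already built object; nothing compiles here."""
    return (bound_pipeline["id"], mesh)
```

An engine that can enumerate its state combinations at load time pays `compiles` once per combination. One that cannot has to call `create` from the render loop, which is the hairpin bend the slide warns about.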

    RENDER OBJECT BINDING MISMATCH

    Hardware uses tables in vid

    BUT still programmed like a

    So one bind becomes:

    Allocate a new chunk of

    Create a new copy of the

    Update the one entry

    Write the register with th

    address

    [Diagram: on-chip root table (1 per stage) with SR and CB entries; each entry is a pointer to a table (here, textures; constant buffers); the SRD table in GPU memory holds a pointer to (+ params of) each resource; the resource itself is in GPU memory]

    DESCRIPTOR TABLES

    Several tables of each type of resource

    Easy to divide up by frequency

    Tables can be of arbitrary size; dynamica

    provide bindless textures

    Changing a table pointer is cheap

    Updating a descriptor in a table is not

    [Diagram: on-chip table with entries SR.T[0], SR.T[1], SR.T[2], SR.T[3], UAV, CB.T[0], CB.T[1], Samp; SR.T[0] is a pointer to a table (textures table 0), an SRD table in GPU memory with entries SR.T[0][0], SR.T[0][1], SR.T[0][2]; CB.T[1] is a pointer to a table (constbuf table 1) with entries CB.T[1][0], CB.T[1][1]]
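The cheap and expensive operations on this slide can be contrasted in a small Python model. None of these names come from D3D12: `DescriptorTable`, `Stage` and `with_updated_descriptor` are hypothetical. The copy-then-patch step is an assumption about why a descriptor update costs more, borrowed from the bind sequence on the previous slide.

```python
class DescriptorTable:
    """A table of descriptors living in (modelled) GPU memory."""

    def __init__(self, descriptors):
        self.descriptors = list(descriptors)


class Stage:
    """Holds pointers to tables, like the on-chip table in the diagram."""

    def __init__(self):
        self.sr_tables = {}   # slot index -> DescriptorTable

    def set_table(self, slot, table):
        # Cheap: a single pointer is rewritten, the table is untouched.
        self.sr_tables[slot] = table


def with_updated_descriptor(table, index, descriptor):
    """Not cheap in this model: the table may still be in use,
    so copy the whole table and patch one entry in the copy."""
    copy = DescriptorTable(table.descriptors)
    copy.descriptors[index] = descriptor
    return copy
```

Grouping descriptors by update frequency, as the slide suggests, keeps most changes on the `set_table` path and leaves the copying path for the few tables that really change per draw.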

    KEY INNOVATIONS

    Innovation CPU-side win GPU-s

    Command buffers

    Build on many threads

    Control of scheduling

    Lower latency

    Simplified s

    Pipeline state objects

    Link at create time

    No JIT shader compiles

    Efficient batched updates

    Cheaper st

    Enable

    Bind objects in groups

    Cheap to change group

    Cheap to ch

    Fits hardwa

    Move work to Create Predictability Enables op

    KEY INNOVATIONS

    Innovation CPU-side win GPU-s

    Explicit Synchronisation

    Efficiency

    Required for bindless textures

    Less o

    Explicit Memory

    Management

    Efficiency

    Predictability

    Application flexibility

    Zero

    Control ove

    Do less

    Predictability, Efficiency

    Enables aggressive schedule

    FEWER BUGS

    NEW PROBLEMS (AND TIPS TO SOLVE THEM)

    NEW VISIBLE LIMITS

    More draws in does not automaticall

    triangles out

    You will not see full rendering rat

    averaging 1 pixel each. Wireframe mode should look diff

    rendering

    NEW VISIBLE LIMITS

    Feeding the GPU much more efficiently means exploring interesting new limits that were

    10k/frame of anything is ~1µs per thing.

    GPU pipeline depth is likely to be 1-10µs (1k-10k cycles).

    Specific limit: context registers

    Shader tables are NOT in the context

    Compute doesn't bottleneck on context
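The arithmetic behind these figures can be checked with a short calculation. The frame rate and GPU clock are not given on the slide; 60 frames per second and a 1 GHz clock are assumptions used here only to show that the orders of magnitude line up, with per-item time budgets and pipeline depth both in the microsecond range.

```python
def per_item_budget_us(items_per_frame, fps=60.0):
    """Time available per item, in microseconds, if the frame is spent on nothing else.
    fps=60 is an assumed frame rate, not a figure from the slide."""
    frame_us = 1_000_000.0 / fps
    return frame_us / items_per_frame


def cycles_to_us(cycles, clock_hz=1e9):
    """Convert a cycle count to microseconds at an assumed 1 GHz clock."""
    return cycles / clock_hz * 1e6
```

Under those assumptions, `per_item_budget_us(10_000)` comes to about 1.7 µs, and 1k to 10k cycles come to about 1 to 10 µs. A per-item budget that small is comparable to the depth of the pipeline itself, which is why limits that used to be hidden start to show.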

    APPLICATION IN CHARGE

    Application is arbiter of correct rendering

    This is a serious responsibility

    The benefits of D3D12 aren't readily available without this condition

    Applications must be warning-free on the debug

    Different opportunities for driver intervention

    APPLICATION IN CHARGE

    No driver thread in play

    App can target much lower latency

    BUT implies app has to be ready with new GPU work

    Driver F1

    App Render Frame 1

    GPU F1

    Frame

    D3D11: No dead GPU time after 1stfra

    Dead

    Time

    First work sent to driver Driver buff

    No buffered present

    USE COMMAND BUFFERS SPARINGLY

    Each API command list maps to a single command buffer

    Starting / ending a command list has an

    Writes full 3D state, may flush cache

    We think a good rule of thumb will be to t

    command buffers/frame

    Use the multiple submission API whe

    [Diagram: multiple applications running on system; Application 0 queue: CB0 CB1 CB2; Application 1 queue: CB0; GPU executes: CB0 CB1 CB2 CB0]

    ROUND-UP

    ALL-NEW

    There's a learning curve here for all of us

    In the main it's a shallow one

    Compared at least to the general problem of multithreaded rendering

    Multithread is always hard.

    Simpler design means fewer bugs and more predictable performance

    WHAT AMD PLAN TO DELIVER

    An early preview driver soon

    Release driver for Direct3D12 launch

    Continuous engagement

    With Microsoft

    With ISVs

    Bring your opinions to us and to Microsoft.

    DX12 AND FROSTBITE

    Johan Andersson

    Technical Director

    DX12 AND FROSTBITE

    PC is very important for EA and we've been pushing hard to improve graphics capabilitie

    Excited to be working with Microsoft and the IHVs on Direct3D again!

    Good & very healthy collaboration between Microsoft, the IHVs and us game/engine dev

    DX12 is a really big step forward from DX11 or GL4

    DX12 FEATURES AND FROSTBITE

    Key DX12 features that are a great fit for Frostbite:

    Efficient parallel command buffers

    Descriptor tables

    Pipeline objects

    Explicit resource synchronization

    Explicit memory management

    DX12 is still in development so actively working with Microsoft & the IHVs to help make

    together and is efficient

    DX12 PLATFORMS

    DX12 support on Windows 7 & most existing PC hardware is critical for us

    Huge user base still on Windows 7

    Gamers would see major benefits without upgrading

    DX12 support on Xbox One is critical for us

    Will lead to improved performance & quality for future Xbox One titles

    Almost all of our games are cross platform Gen4/PC

    Easier development: renderer is shared between Windows & Xbox One

    Looking forward to DX12 on mobile/tablets

    Power efficiency & low overhead is really key

    Need larger user base to target on Windows for mobile

    DX12 AND FROSTBITE

    We are building a DX12 renderer for Frostbite!

    Will work on GPUs from all vendors: benefits a wide set of gamers

    Expected benefits over DX11:

    More stable and consistent performance

    Higher overall performance

    Move our design target: richer & more detailed game worlds

    Thinner drivers: easier to work with / less of a black box

    More control for us developers: new techniques & optimizations

    Really happy that the full Windows & Xbox eco systems are moving to low-level graphic

    QUESTIONS