PCI Express®: Enabling New Opportunities for Graphics
Barry Wagner
Director of Technical Marketing
NVIDIA Corporation
Session Outline
Increased bandwidth of PCI Express® enables a new direction for GPU design
Scalability of PCI Express creates a new category of gaming machines
PCI Express transition creates an opportunity for a new open standard for laptop graphics upgrades
PCI Express – Raising the Bar on Mainstream Graphics
Graphics companies utilize PCI Express bandwidth in place of dedicated graphics memory
Substantial cost saving for consumers
Graphics pipeline changed to allow rendering directly to system memory
Combination of HW and SW used to manage surface allocation
Users get better compatibility, stability, reliability from new, frequently improved drivers and GPU architectures at a lower price.
Rendering to System Memory
Fundamental new direction in discrete GPU design
Direct rendering to system memory significantly reduces local graphics memory requirements
Better laptop performance at lower power and cost due to fewer memory devices
Delivers latest graphics architectures and performance to affordable price points
Enabled by PCI Express
PCI Express Bandwidth Improvements
Improved Write Bandwidth is Critical for Rendering Directly to System Memory

Bandwidth (GB/s)     READ (1)   WRITE (2)
AGP-8X               2          0.2
PCI Express x16      4          4

(1) READ from system memory to GPU
(2) WRITE from GPU to system memory
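To see why the write figure matters, the following rough sketch estimates how many full color-buffer writes per second each bus could sustain. It uses only the bandwidth numbers from the table; the 1024x768 resolution and 32-bit pixel size are assumptions chosen for illustration, and the estimate ignores overdraw, depth traffic, and protocol overhead.

```python
# Bandwidth figures (GB/s) taken from the table above.
BANDWIDTH_GBPS = {
    "AGP-8X": {"read": 2.0, "write": 0.2},
    "PCI Express x16": {"read": 4.0, "write": 4.0},
}


def frame_bytes(width, height, bytes_per_pixel=4):
    """Size of one color buffer in bytes (32-bit pixels are an assumption)."""
    return width * height * bytes_per_pixel


def max_frames_per_second(bus, direction, width=1024, height=768):
    """Upper bound on full-frame transfers per second over the given bus."""
    bytes_per_second = BANDWIDTH_GBPS[bus][direction] * 1e9
    return bytes_per_second / frame_bytes(width, height)


for bus in BANDWIDTH_GBPS:
    fps = max_frames_per_second(bus, "write")
    print(f"{bus}: at most {fps:.0f} full-frame writes per second")
```

Under these assumptions the PCI Express x16 write path allows roughly 20 times as many frame writes as AGP-8X, which is the ratio of the two WRITE entries in the table.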
Benefits of PCI Express System Cache vs. Conventional AGP Architecture
[Diagram, conventional AGP architecture: Core Logic, System DRAM, and a Traditional GPU with 128MB of graphics memory (16Mx16 devices). Labels: 3.2 GB/s; 4GB/s; "cannot render efficiently to system memory".]
[Diagram, PCI Express architecture: Core Logic, System DRAM, and a PCI Express Graphics Device with 32MB of local memory (4Mx32 devices) plus 96MB dynamically allocated for graphics in System DRAM, for 128MB of graphics memory. Labels: 5.6 GB/s; 4GB/s and 4GB/s; 13.6GB/s peak bandwidth.]
4GB/s bi-directional PCI Express data paths enable efficient rendering to system memory
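The bandwidth labels in the diagrams appear to follow from simple arithmetic, sketched below. The DRAM clocks (350 MHz and 200 MHz) are taken from the pricing table later in this deck; the 64-bit local bus width is an assumption inferred from the device organizations (two 4Mx32 or four 16Mx16 devices), not something the slide states.

```python
def ddr_bandwidth_gbps(bus_width_bits, clock_mhz):
    """Peak DDR bandwidth: two transfers per clock across the full bus."""
    transfers_per_second = clock_mhz * 1e6 * 2
    return bus_width_bits / 8 * transfers_per_second / 1e9


# Assumed 64-bit local memory bus in both designs.
local_shared = ddr_bandwidth_gbps(64, 350)       # two 4Mx32 DDR devices
local_traditional = ddr_bandwidth_gbps(64, 200)  # four 16Mx16 DDR devices

# PCI Express x16 figures from the bandwidth table (GB/s per direction).
pcie_read, pcie_write = 4.0, 4.0

peak_shared = local_shared + pcie_read + pcie_write

print(f"Traditional local memory: {local_traditional:.1f} GB/s")
print(f"Shared design local memory: {local_shared:.1f} GB/s")
print(f"Shared design peak (local + PCIe both ways): {peak_shared:.1f} GB/s")
```

With these inputs the sketch reproduces the 3.2 GB/s, 5.6 GB/s, and 13.6 GB/s labels. The peak figure assumes local memory and both PCI Express directions are saturated at the same time.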
Typical 3D Pipeline
[Diagram: Triangle Setup, L2 Tex, Shader Instruction Dispatch, Fragment Crossbar, and Z-Cull, backed by four Graphics Memory partitions]
Shared Memory Through MMU
Render directly to system memory at full speed
Texture from system memory at full speed
Dynamically allocate surfaces anywhere
Present in the NVIDIA TurboCache™ architecture
[Diagram: the same pipeline blocks (Triangle Setup, L2 Tex, Shader Instruction Dispatch, Fragment Crossbar, Z-Cull) with an MMU added alongside the four Graphics Memory partitions and System Memory]
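The idea of dynamically allocating surfaces anywhere can be illustrated with a toy allocator. This is a minimal sketch under stated assumptions, not the TurboCache driver's actual policy: the class names, the "local first, then system" placement rule, and the surface sizes are all hypothetical. Only the 32MB local and 96MB system pool sizes come from the architecture diagram.

```python
from dataclasses import dataclass, field

MB = 1024 * 1024


@dataclass
class Pool:
    """A fixed-size memory pool (hypothetical model, capacity in bytes)."""
    name: str
    capacity: int
    used: int = 0

    def try_alloc(self, size):
        """Reserve `size` bytes if they fit; report whether it succeeded."""
        if self.used + size > self.capacity:
            return False
        self.used += size
        return True


@dataclass
class SurfaceAllocator:
    """Places each surface in local memory if it fits, else system memory."""
    local: Pool
    system: Pool
    placements: dict = field(default_factory=dict)

    def allocate(self, surface, size):
        for pool in (self.local, self.system):
            if pool.try_alloc(size):
                self.placements[surface] = pool.name
                return pool.name
        raise MemoryError(f"no room for surface {surface!r}")


# Pool sizes from the diagram: 32MB local, 96MB of system memory.
allocator = SurfaceAllocator(Pool("local", 32 * MB), Pool("system", 96 * MB))
allocator.allocate("color buffer", 3 * MB)   # 1024x768 at 32 bits per pixel
allocator.allocate("depth buffer", 3 * MB)
allocator.allocate("textures", 40 * MB)      # too large for what is left locally
print(allocator.placements)
```

A real implementation would also weigh how often each surface is read or written when choosing a pool; this sketch only shows that, behind an MMU, a surface can live in either pool.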
Shared Memory Enables Higher Performance at Lower Price

Configuration          Device       QTY   DRAM Speed   DRAM Price (1)   Local Memory Cost
128MB Shared Memory    4Mx32 DDR    2     350 MHz      $3.50            $7.00
128MB Traditional      16Mx16 DDR   4     200 MHz      $3.50            $14.00
256MB Shared Memory    8Mx16 DDR    4     200 MHz      $2.00            $8.00
256MB Traditional      16Mx16 DDR   8     200 MHz      $3.50            $28.00

[Chart: 3DMark05, 10x7, 0x AA / 0x Aniso (axis 0 to 5), comparing 128MB Shared Memory, 256MB Shared Memory, 128MB 64-bit Local Memory, 256MB 128-bit Local Memory, and Integrated Graphics]

(1) DRAM pricing changes constantly. A variety of market sources are available to confirm the approximate pricing reflected here. http://www.dramexchange.com is one such source.
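The local memory cost column is device quantity multiplied by per-device price. The short sketch below recomputes it from the slide's figures and derives the saving for each capacity; the variable and function names are illustrative only.

```python
# (configuration, device, quantity, price per device in USD), from the table.
CONFIGS = [
    ("128MB Shared Memory", "4Mx32 DDR", 2, 3.50),
    ("128MB Traditional", "16Mx16 DDR", 4, 3.50),
    ("256MB Shared Memory", "8Mx16 DDR", 4, 2.00),
    ("256MB Traditional", "16Mx16 DDR", 8, 3.50),
]


def local_memory_cost(quantity, unit_price):
    """Total cost of the DRAM devices soldered to the graphics board."""
    return quantity * unit_price


costs = {name: local_memory_cost(qty, price) for name, _, qty, price in CONFIGS}

for size in ("128MB", "256MB"):
    saving = costs[f"{size} Traditional"] - costs[f"{size} Shared Memory"]
    print(f"{size}: shared memory design saves ${saving:.2f} in local DRAM")
```

These are DRAM device costs only, at the approximate prices quoted on the slide; they do not include any other board or system cost.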
No Negative Impact to System Performance
Shared System Memory Design Outperforms Integrated Graphics and Traditional GPU architectures in a Variety of System Benchmarks
[Chart: relative scores (axis 0 to 1) for Integrated Graphics, 128MB Traditional 64-bit GPU, and 128MB Shared Memory on WB 99 High-End Disk, SysMark2004, PCMark 2004, Business Winstone 2004, Biz WS CC, and WB 99 Business Disk]
More Usable PCI Express Bandwidth = Faster Performance
[Chart: 3DMark05, 10x7, 0x AA / 0x Aniso; relative score (axis 0 to 1.4) for Early PCI Express Core Logic (4 GB/s) and New PCI Express Core Logic (6 GB/s)]
No need to wait for new applications to use the new bandwidth, as was the case with AGP
Key Points
PCI Express has enabled a new direction in GPU Design
Graphics performance scales with usable PCI Express bandwidth improvements
Cost reductions for graphics memory enable more consumers to experience games the way they were meant to be played.
Laptops get better performance, lower costs, and longer battery life with system memory cache
PCI Express Returns Scalable Graphics Performance to the PC
AGP was NOT very scalable
Architected for a single graphics slot
Scaled frequency over the bus life-cycle
PCI Express planned for scalability
Scalable bus widths: x1, x2, x4, x8, x16
Port Splitting
Down-shifting
Frequency scaling: 2.5GHz (Gen1), 5GHz (Gen2)
General purpose bus means multiple slots available for scaling performance
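How link width and signaling rate combine into bandwidth can be sketched as follows. The sketch assumes each lane signals at the rates listed above (2.5 and 5 gigatransfers per second) and that both generations use 8b/10b encoding, so 8 of every 10 bits on the wire carry data; packet and protocol overhead are ignored.

```python
# Per-lane signaling rate in gigatransfers per second.
TRANSFER_RATE_GT = {"Gen1": 2.5, "Gen2": 5.0}

# 8b/10b encoding: 8 data bits are carried in every 10 bits transmitted.
ENCODING_EFFICIENCY = 8 / 10

LINK_WIDTHS = (1, 2, 4, 8, 16)


def link_bandwidth_gbps(generation, lanes):
    """Peak payload bandwidth in GB/s for one direction of the link."""
    data_bits_per_second = TRANSFER_RATE_GT[generation] * 1e9 * ENCODING_EFFICIENCY
    return lanes * data_bits_per_second / 8 / 1e9


for generation in TRANSFER_RATE_GT:
    row = ", ".join(
        f"x{lanes}: {link_bandwidth_gbps(generation, lanes):.2f}"
        for lanes in LINK_WIDTHS
    )
    print(f"{generation} (GB/s per direction) {row}")
```

For a Gen1 x16 link this gives 4 GB/s in each direction, matching the PCI Express x16 row of the bandwidth table earlier in the deck.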
Port-Splitting in the Core Logic
Dividing a slot into multiple independent links
Benefits
Allows 2 or more GPUs on a single x16 slot without the need for a bridge chip
Does not require multiple x16 slots on the motherboard
Enables performance for a wider install base of motherboards
Drawbacks
Higher power density on a single card
May require larger than standard form factor cards
Optional feature in the PCI Express specification, so there is a risk that a specific card may not work in every system.
Down-Shifting on the Motherboard
Wiring fewer lanes to a slot than the max
x8 slots wired for x4 link width
x16 slots wired for x8 link widths
Why?
There will always be a limited number of lanes
Benefits
Cheapest way to get 2 slots that are capable of fitting the x16 edge connector used on graphics cards
2x the GPU performance with standard cards
Avoid inventory management of both x8 and x16 cards
Drawback
Motherboard has to support slot power level
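The effect of down-shifting on a card can be sketched as a width negotiation: the link runs at the widest standard width that both the card and the slot wiring allow. The function below is a simplified illustration, assuming the card supports every standard width up to its native one (real cards are only required to support some of them). The 0.25 GB/s per-lane figure is derived from the 4 GB/s quoted for x16 earlier in the deck.

```python
STANDARD_WIDTHS = (1, 2, 4, 8, 16)
GBPS_PER_LANE = 4.0 / 16  # from the 4 GB/s quoted for a x16 link


def negotiated_width(card_lanes, wired_lanes):
    """Widest standard link width that both the card and the slot can use."""
    limit = min(card_lanes, wired_lanes)
    return max(width for width in STANDARD_WIDTHS if width <= limit)


def slot_bandwidth_gbps(card_lanes, wired_lanes):
    """Per-direction bandwidth of the resulting link."""
    return negotiated_width(card_lanes, wired_lanes) * GBPS_PER_LANE


# Two x16 graphics cards in x16 connectors that are each wired for x8.
for slot in ("slot A", "slot B"):
    width = negotiated_width(card_lanes=16, wired_lanes=8)
    bandwidth = slot_bandwidth_gbps(16, 8)
    print(f"{slot}: links at x{width}, {bandwidth:.1f} GB/s per direction")
```

Each card gets half the per-direction bandwidth of a fully wired x16 slot, while the two slots together still use only 16 lanes from the core logic.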
Scalable Graphics Impact
PCI Express High-End Graphics Specification
75W from x16 slot
75W from HE power connector
Scalable solutions demand more power for the graphics subsystem
Graphics companies have already started to request that the PCI-SIG® begin addressing the demand for cards that draw more than 150W.
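As a rough budgeting aid, the two 75W sources above add up to 150W per card, and a scalable system multiplies that by the number of cards. The sketch below is plain arithmetic on the slide's figures; the function names are illustrative.

```python
SLOT_POWER_W = 75           # from the x16 slot
HE_CONNECTOR_POWER_W = 75   # from the HE power connector


def card_power_limit_w():
    """Maximum power available to one high-end graphics card."""
    return SLOT_POWER_W + HE_CONNECTOR_POWER_W


def graphics_power_budget_w(num_cards):
    """Worst-case graphics subsystem power for a multi-card system."""
    return num_cards * card_power_limit_w()


for cards in (1, 2):
    print(f"{cards} card(s): budget up to {graphics_power_budget_w(cards)}W")
```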
When do Systems Benefit from Dual GPU Designs?
When you play modern GPU-intensive games at:
High resolutions (1600x1200) with:
Anti-Aliasing (4xAA)
Anisotropic Filtering (8xAF)
Example Applications that Scale: Battlefield Vietnam, Doom3, Far Cry
Game images are the property of their respective owners. All Rights Reserved.
Nvidia’s Design With Scalable PCI Express
GPU-to-GPU Interconnect
1GB/s digital link
Provides pixel and synchronization information
Multiple system implementations supported
Rigid PCB with standard edge connectors
Flexible cables with standard edge connectors
Embedded designs
Optimized within Software
Can be enabled or disabled by driver and/or user
User may prefer to enable 4 displays for some applications in non-scaled mode
Solutions Possible for Any Compatible System
Dual x16 slots fully-wired motherboard:
2 compatible standard graphics cards with scalability support
Rigid or flexible cable between the cards
Single x16 slot without Port Splitting:
2 GPUs on a single card plus an x16 to x8 PCI Express bridge
Inter-connect embedded in PCB
Single x16 slot with Port Splitting:
2 GPUs on a single card
Inter-connect embedded in PCB
Dual x16 slots each down-shifted to x8:
2 compatible graphics cards with scalability support
Rigid or flexible cable between the cards
Nvidia’s Dynamic Load Balancing Technology
HW and Driver work together to determine best algorithms to share workload between cards
Alternate Frame Rendering (AFR)
Split Frame Rendering (SFR)
Performance Improvement depends on ability to share work across cards
Works with almost any 3D application
Today some applications benefit more than others
Some popular applications show 1.7x – 2x increase
Future games will trend toward 2x increase as indicated by 3DMark05
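The two work-sharing modes named above can be sketched in a few lines. This is an illustrative model only, not NVIDIA's load-balancing algorithm: the function names, the proportional rebalancing rule, the `gain` damping factor, and the 10% to 90% clamp are all assumptions. AFR hands whole frames to GPUs in turn; SFR splits each frame at a row and moves that row based on how long each GPU took on the previous frame.

```python
def afr_gpu_for_frame(frame_index, num_gpus=2):
    """Alternate Frame Rendering: whole frames are assigned round-robin."""
    return frame_index % num_gpus


def sfr_split_row(height, split_fraction):
    """Split Frame Rendering: rows above the split go to GPU 0, the rest to GPU 1."""
    return round(height * split_fraction)


def rebalance(split_fraction, top_time_ms, bottom_time_ms, gain=0.5):
    """Shift the split toward the share each GPU could finish in equal time.

    Hypothetical rule: estimate each GPU's throughput from the last frame,
    then move part of the way (gain) toward the balanced split.
    """
    if top_time_ms <= 0 or bottom_time_ms <= 0:
        return split_fraction
    top_rate = split_fraction / top_time_ms
    bottom_rate = (1 - split_fraction) / bottom_time_ms
    target = top_rate / (top_rate + bottom_rate)
    updated = split_fraction + gain * (target - split_fraction)
    return min(max(updated, 0.1), 0.9)


# AFR: frames 0..3 alternate between the two GPUs.
print([afr_gpu_for_frame(i) for i in range(4)])

# SFR: the top half took twice as long, so the top GPU gets fewer rows next frame.
split = rebalance(0.5, top_time_ms=20, bottom_time_ms=10)
print(sfr_split_row(1200, 0.5), "->", sfr_split_row(1200, split))
```

The design point the sketch tries to capture is that scaling depends on how evenly the work divides: AFR balances well when consecutive frames cost about the same, while SFR needs the split to track where the expensive pixels are.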
NVIDIA SLI™
A new class of gaming performance
Benchmarks run at 1600x1200 4x/8x on nForce 4 SLI Motherboard with AMD Athlon 64FX
[Chart: relative performance (axis 0% to 200%), SINGLE GPU vs. DUAL GPU, for 3DMark05, Doom 3, FlatOut, Half-Life 2, Thief: Deadly Shadows, and Warhammer 40,000: DoW]
Key Points:
PCI Express provides for better scalability
Factor dual GPU architectures into your graphics power budgets
A New Opportunity to Address Mobile Graphics Form Factor
Desktop PC standard form factors have enabled growth and innovation
Rapid time to market
Range of choice addresses all market segments
Drives competition
The Mobile PC Platform has not benefited from a common form factor
Mobile PC development cycle is long
Per platform custom graphics integration is expensive
Custom design limits OEM/ODM choices & ultimately the consumer’s
Notebook Graphics Modules Today
Engineering resources are not leveraged
Custom BIOS
Custom connectors
Custom power delivery
Custom cooling
Custom Power Management
PCI Express is a catalyst for change
PCI Express electrical changes require a new look at the platform
Robust differential links can tolerate modular design
No longer able to just send out the old footprint-compatible laptop.
PCI Express brings new power management options
Scalable link widths
Power down links you don’t need
Notebook Platform has matured substantially
Platform requirements are better understood
Lessons learned from current module experience
What issues must be addressed by a new Mobile PC Graphics Architecture?
Broad industry participation
Open Architecture
A common host interface (PCI Express)
Enable broad range of power, thermal, & mechanical boundaries
Cover a variety of display output technologies
VGA, DVI, LVDS, Video (HDCP)
A common software layer and partitioning for the video BIOS and system BIOS
MXM – Mobile PCI Express Module
Open architecture co-developed with many leading ODMs/OEMs
AOpen, Arima, Asustek, Clevo, FIC, Mitac, Quanta, Tatung, Uniwill, Wistron and more.
Enables a consistent graphics interface across all PCI Express notebooks
Supports up to x16 PCI Express
One design, many notebooks
Use different graphics solutions, from ANY vendor
Potential for consumer upgradeable graphics
Modules already designed for various NVIDIA, ATI, and S3 products
Key Points:
The PCI Express transition is creating new opportunities for competition in the notebook PC
The notebook platform is evolving toward a common modular graphics form factor in many segments
MXM is one open architecture developed with the notebook industry to leverage the limited engineering resources.
Now is the time for the industry to come together on a common form factor.
Call To Action
Consider system memory cache solutions for your mainstream desktop and mobile markets
For maximum graphics performance, build systems with multiple x16 slots to enable best scalability of graphics hardware
Consider whether MXM is right for your next laptop design or purchasing decision
Let your PCI-SIG reps know you want the graphics industry power and scalability issues to continue to be addressed by the PCI-SIG.
For More Information
PCI Express Specifications: www.pcisig.com
NVIDIA PCI Express Product Information: http://www.nvidia.com/page/pci_express.html
NVIDIA TurboCache Technology: http://www.nvidia.com/page/turbocache.html
NVIDIA SLI Technology: http://www.nzone.com/object/nzone_sli_home.html
MXM Technology: http://www.nvidia.com/page/mxm.html
Community Resources
Windows Hardware & Driver Central (WHDC): www.microsoft.com/whdc/default.mspx
Technical Communities: www.microsoft.com/communities/products/default.mspx
Non-Microsoft Community Sites: www.microsoft.com/communities/related/default.mspx
Microsoft Public Newsgroups: www.microsoft.com/communities/newsgroups
Technical Chats and Webcasts: www.microsoft.com/communities/chats/default.mspx and www.microsoft.com/webcasts
Microsoft Blogs: www.microsoft.com/communities/blogs