Dragged, Kicking and Screaming: Multicore Architecture and Video Games


Page 1: Dragged, Kicking and Screaming:

Dragged, Kicking and Screaming: Multicore Architecture and Video Games

Page 2: Dragged, Kicking and Screaming:

Summary of Topics:
• Console Architecture
• Meaning of Paper’s Title / Why the Video Game Developer HATED the new
• Techniques/Problems
• The Future

Page 3: Dragged, Kicking and Screaming:

Video Game Architecture

For the most part, same as a computer:
• Very operating system-linked.
• PCs have almost always had games.
• Mac gaming is sparse, but has recently increased.
• Linux users have to compile/make their own.

Console Games = primarily single-core processors…until 2005.

Page 4: Dragged, Kicking and Screaming:

XBOX 360
• 3.2 GHz “Xenon” triple-core PowerPC, 2 hardware threads per core
• 512 MB main RAM (shared by CPU and GPU)
• 500 MHz ATI “Xenos” GPU
  - CPU accesses memory through the GPU!
• GPU has 10 MB of embedded frame-buffer RAM

Page 5: Dragged, Kicking and Screaming:

XBOX 360 vs. Playstation 3

Xbox 360: Triple-Core PPC
• 512 MB of 700 MHz GDDR3, shared by CPU and GPU
• CPU accesses memory through the GPU!
• GPU has 10 MB RAM embedded frame buffer

PS3: Multicore Cell Engine
• 512 MB total
• 256 MB of 3.2 GHz XDR main RAM for the CPU
• 256 MB of 700 MHz GDDR3 video RAM for the GPU

Page 6: Dragged, Kicking and Screaming:

Cell Architecture

Multiple Synergistic Processing Elements (SPEs), each attached to its own local store; the local stores feed DMA engines that connect to the on-chip bus. One separate PPE (Power Processing Element), with an L1 and L2 cache. Developers are having some serious problems with this model.
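To make the local-store model concrete, here is a toy sketch of the pattern it forces on programmers: data must be explicitly copied into a small local buffer, processed there, and copied back. This is only an illustration in Python; the names `dma_get`, `dma_put`, `spe_kernel`, and the capacity `LOCAL_STORE_ITEMS` are hypothetical and are not the Cell SDK's API.

```python
# Toy model only: real SPE code issues DMA commands, it does not slice lists.
LOCAL_STORE_ITEMS = 4  # hypothetical (tiny) local-store capacity, in items

def dma_get(main_memory, start, count):
    """Copy a chunk of main memory into a fresh local-store buffer."""
    return list(main_memory[start:start + count])

def dma_put(main_memory, start, local_buffer):
    """Copy a processed local-store buffer back into main memory."""
    main_memory[start:start + len(local_buffer)] = local_buffer

def spe_kernel(local_buffer):
    """Work that only ever touches the local store (here: double each value)."""
    return [x * 2 for x in local_buffer]

def run_on_spe(main_memory):
    """Stream main memory through the local store one chunk at a time."""
    for start in range(0, len(main_memory), LOCAL_STORE_ITEMS):
        local = dma_get(main_memory, start, LOCAL_STORE_ITEMS)
        dma_put(main_memory, start, spe_kernel(local))
```

The point of the sketch is that the kernel never sees main memory directly, so every data structure has to be cut into chunks that fit the local store. That restructuring is plausibly a large part of why this model is hard to adopt.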

Page 7: Dragged, Kicking and Screaming:

Why So Unhappy?

Delays, setbacks, et cetera = unhappy fans.

Yu Suzuki (Virtua Fighter, Sega Saturn): “One very fast central processor would be preferable...I think that only one in 100 programmers are good enough to get this kind of speed out of the Saturn.”

Not implementing parallelism, not using the multicore architecture, etc. = unhappy fans. But if game developers do utilize parallelism, the game will be delayed: 6 months? 1 year?

Page 8: Dragged, Kicking and Screaming:

Multicore Parallelism Implementations

“I guess, if we have to.”

Page 9: Dragged, Kicking and Screaming:

Beginning Techniques
• Patches, so computers at least realize there are multiple cores available.
• Intel released several multicore assists, especially in the beginning (coaxing people into it)
  • Building Blocks
• Codeplay’s sieve compilers
  • Broke a program into “sieve blocks” where automatic parallelization could be utilized

Page 10: Dragged, Kicking and Screaming:

What do we do today?

Multithreading from the ground up
• Decent (and fast!) parallelization

One of two main ways:
• Every process on a different thread
  • Dependencies galore!

Page 11: Dragged, Kicking and Screaming:

“Best” Multithreading Approach

• Main gaming thread, with branches coming off for specific parts of the game and splintering into other threads.
• Particularly beastly programs get their own multithreading implementations.
• Networking and I/O get their own threads.

Page 12: Dragged, Kicking and Screaming:

CASE EXAMPLE: Kameo

• Achieved 2.2~2.5 cores' worth of utilization in 6 months.
• Rendering and decompression were moved onto separate threads.
• Decompression saved space on the DVD and improved load times for the game.
• Additionally, file I/O was separated onto two threads: one for reading, and one for decompressing.
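The read/decompress split can be sketched as a two-stage pipeline. This is a minimal illustration, not Kameo's code: it assumes zlib-compressed chunks and uses the hypothetical names `read_chunks`, `decompress_chunks`, and `load_asset`; Python stands in for whatever the engine actually used.

```python
import queue
import threading
import zlib

def read_chunks(compressed_chunks, pipe):
    """Reader thread: stands in for streaming compressed blocks off the disc."""
    for chunk in compressed_chunks:
        pipe.put(chunk)       # blocks when the decompressor falls behind
    pipe.put(None)            # sentinel: no more data

def decompress_chunks(pipe, out):
    """Decompressor thread: inflates one block while the reader fetches the next."""
    while True:
        chunk = pipe.get()
        if chunk is None:
            break
        out.append(zlib.decompress(chunk))

def load_asset(compressed_chunks):
    """Run both stages concurrently and return the decompressed bytes."""
    pipe = queue.Queue(maxsize=2)   # small buffer between the two stages
    out = []
    threads = [
        threading.Thread(target=read_chunks, args=(compressed_chunks, pipe)),
        threading.Thread(target=decompress_chunks, args=(pipe, out)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return b"".join(out)
```

The bounded queue is the design choice worth noting: it keeps the reader at most two blocks ahead, so slow decompression throttles reading instead of filling memory.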

Page 13: Dragged, Kicking and Screaming:

Best Processes for MT
• File decompression: improves load times.
• Rendering: separate update and render; can be problematic.
• Physics Engine?: Physics/Update/Render, but latency issues.
• Graphical Fluff: always and forever.
• Artificial Intelligence: position independency of data, cache coherency.
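One way to picture the update/render split is a hand-off of per-frame snapshots between two threads. The sketch below is an assumption about how such a split might look, not any engine's actual design; `update_loop`, `render_loop`, and `run` are hypothetical names, and the "game state" is a single number.

```python
import queue
import threading

def update_loop(n_frames, handoff):
    """Simulation thread: advances state and publishes one snapshot per frame."""
    x = 0.0
    for frame in range(n_frames):
        x += 1.5                    # stand-in for physics/AI/game logic
        handoff.put((frame, x))     # immutable snapshot; blocks if renderer lags
    handoff.put(None)               # sentinel: simulation finished

def render_loop(handoff, drawn):
    """Render thread: draws whichever snapshot was handed over last."""
    while True:
        snapshot = handoff.get()
        if snapshot is None:
            break
        drawn.append(snapshot)      # stand-in for issuing draw calls

def run(n_frames):
    """Run update and render concurrently; return the frames that were drawn."""
    handoff = queue.Queue(maxsize=1)    # renderer is at most one frame behind
    drawn = []
    threads = [
        threading.Thread(target=update_loop, args=(n_frames, handoff)),
        threading.Thread(target=render_loop, args=(handoff, drawn)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return drawn
```

Because the renderer always works on a snapshot that the simulation has already moved past, what is on screen trails the game state by a frame. That is one plausible reading of the latency issue noted above.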

Page 14: Dragged, Kicking and Screaming:

Cascade Project
• Fix dataflow by sending data from the parent to the child before the parent has completed!
• Respect dependencies, divided AI.
• Resulted in reducing “the average time per frame from 15.5ms using a single thread to 7.8ms using eight threads.”
• Roughly a 2x speedup (about 50% less time per frame)!
• Work in progress: CDML
  • List constraints in the language instead of working them out later.
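As a quick check on the quoted frame times, the arithmetic works out as follows. Here $T_1$ and $T_8$ are just labels for the single-thread and eight-thread times, and parallel efficiency (speedup divided by thread count) is a standard measure added for context, not a figure from the source.

```latex
\begin{align*}
S &= \frac{T_1}{T_8} = \frac{15.5\ \text{ms}}{7.8\ \text{ms}} \approx 1.99
  && \text{(speedup)} \\
1 - \frac{T_8}{T_1} &= 1 - \frac{7.8}{15.5} \approx 0.497
  && \text{(fraction of frame time saved)} \\
E &= \frac{S}{8} \approx 0.25
  && \text{(efficiency on eight threads)}
\end{align*}
```

So eight threads roughly halve the frame time, which is a real gain but well short of the 8x an ideal scaling would give.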

Page 15: Dragged, Kicking and Screaming:

Multithreading is Tricky
• Threads can fight over the cache
• Dependencies
• Data corruption, deadlocks
• Bugs might not be apparent right away
• Debugging sets developers back
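The deadlock and data-corruption points can be illustrated with two game entities trading gold from two threads at once. This is a hedged sketch with hypothetical names (`Entity`, `trade`, `stress`). It shows one common remedy, taking locks in a fixed global order, which is not something the source prescribes.

```python
import threading

class Entity:
    """A game object whose gold may be modified from several threads."""
    def __init__(self, ident, gold):
        self.ident = ident
        self.gold = gold
        self.lock = threading.Lock()

def trade(src, dst, amount):
    """Move gold between entities, locking both in a fixed (ident) order.

    If each thread locked `src` first, two opposite trades could each hold
    one lock and wait forever for the other: a deadlock.
    """
    first, second = sorted((src, dst), key=lambda e: e.ident)
    with first.lock, second.lock:
        src.gold -= amount
        dst.gold += amount

def stress(n_trades):
    """Run opposite trades concurrently; return the final gold totals."""
    a, b = Entity(1, 100), Entity(2, 100)
    t1 = threading.Thread(target=lambda: [trade(a, b, 1) for _ in range(n_trades)])
    t2 = threading.Thread(target=lambda: [trade(b, a, 1) for _ in range(n_trades)])
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return a.gold, b.gold
```

Without the locks, the two read-modify-write updates could interleave and silently lose gold. With locks taken in inconsistent order, the program could hang only occasionally, which is the kind of bug that is not apparent right away.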

Page 16: Dragged, Kicking and Screaming:

The Future
• ARM’s GPU/CPU Chip
• Intel’s Larrabee Chip
• Mobile Gaming Platforms laugh for now…
• Unreal Engine 4: “We’re waiting for massively multicore processors.”

Page 17: Dragged, Kicking and Screaming:

Thanks for watching!

It’s just not that easy anymore.