Godson-3B1500

Post on 23-Feb-2016



GODSON-3B1500

Mostafa Koraei

A presentation for DSP Implementation Course

Main reference: Godson-3B1500: A 32nm 1.35GHz 40W 172.8GFLOPS 8-Core Processor, ISSCC 2013

HISTORY OF GODSON
• BLX IC Design Corporation was founded in 2002 in Beijing
• BLX is fabless; chips are fabricated by STMicroelectronics
• Loongson processors running Linux are strategic for China
• Based on MIPS64
• Prof. Hu Weiwu
[1,2]

LOONGSON1 OR GODSON-1
• 32-bit; instruction set is MIPS32
• Internal architecture is different
• 266 MHz
• 0.18 micron CMOS
• 8 KB data/instruction cache
• 200 MFLOPS
• In 2007 ICT bought a MIPS license

[1,2]

LOONGSON2
• 64-bit architecture
• 500 MHz
Loongson2E
• 4-way superscalar, out-of-order execution
• 2 ALUs, 2 FPUs
• Separate 64 KB/64 KB instruction and data L1 caches
• On-chip 512 KB L2 cache
• Max 7 W at 1 GHz
Loongson2F
• DDRII memory controller
• Max 4 W at 1 GHz
[2,3]

LOONGSON2
Loongson2F
• Software-controlled dynamic power management
Loongson2G
• 1 GHz, 65 nm
• 3 W
• 64 KB + 64 KB L1 (four-way)
• 1 MB L2 cache
Loongson2H
• SATA, USB, GMAC controller

[3]

[2,3]

GODSON 3
• Scalable architecture, 2010
• 65 nm, 1 GHz, quad core, max 15 W
• 64-entry register file
• Separate 16-entry reservation stations for fixed point and floating point
• Reconfigurable core: GS464 or GStera
• Dynamic L2 cache migration
• DMA is reconfigurable to move data from or to L2 or main memory
[4,6]

MICRO ARCHITECTURE

• BTB: Branch Target Buffer
• BHT: Branch History Table
• AGU: Address Generation Unit
• ITLB: Instruction Translation Lookaside Buffer
• DTLB: Data Translation Lookaside Buffer

[6]

RECONFIGURABILITY

[6]

DIE PHOTO OF GODSON-3B1500

GODSON 3B1500
• 8 cores, 32 nm, 1.35 GHz, 40 W
• 172.8 GFLOPS
• Cu-layer high-κ metal-gate (HKMG) process
• 1.14 billion transistors in 182.5 mm²
• 35% power-efficiency improvement

[1,5]
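The headline numbers on this slide can be cross-checked with simple arithmetic. The sketch below uses only figures quoted in the slides and in the titles of references [1] and [5]. The function names are illustrative, and reading the 35% figure as a GFLOPS-per-watt ratio against Godson-3B is an assumption rather than something the slide states.

```python
import math

# Figures quoted on the slide and in the title of reference [1].
CORES = 8
CLOCK_GHZ = 1.35
PEAK_GFLOPS = 172.8
POWER_W = 40.0

# Godson-3B figures taken from the title of reference [5].
GODSON_3B_GFLOPS = 128.0
GODSON_3B_POWER_W = 40.0


def flops_per_cycle_per_core(peak_gflops, cores, clock_ghz):
    """Peak FLOPs each core must retire per cycle to reach the quoted peak."""
    return peak_gflops / (cores * clock_ghz)


def gflops_per_watt(peak_gflops, power_w):
    """Peak power efficiency in GFLOPS per watt."""
    return peak_gflops / power_w


per_core = flops_per_cycle_per_core(PEAK_GFLOPS, CORES, CLOCK_GHZ)
eff_new = gflops_per_watt(PEAK_GFLOPS, POWER_W)
eff_old = gflops_per_watt(GODSON_3B_GFLOPS, GODSON_3B_POWER_W)
improvement = eff_new / eff_old - 1.0  # assumed meaning of the 35% figure

print(f"FLOPs/cycle/core: {per_core:.1f}")
print(f"GFLOPS/W: {eff_new:.2f} (Godson-3B: {eff_old:.2f})")
print(f"Efficiency gain: {improvement:.0%}")
```

Under these assumptions the numbers work out to about 16 FLOPs per cycle per core and 4.32 GFLOPS/W, a 35% gain over the 3.2 GFLOPS/W implied by the Godson-3B title.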

MODIFICATIONS OF THE MEMORY HIERARCHY

• Last-level cache (LLC) is increased from 4 MB to 8 MB
• A 4-way 128 KB private victim cache in each core
• Low-cost asynchronous FIFO between every core and the uncore to isolate cores in voltage and frequency

[1]
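To illustrate what a victim cache does in general, here is a toy Python model. It is not the Godson-3B1500 design: the direct-mapped primary cache, the LRU replacement in the victim store, the tiny sizes, and the class name `VictimCachedL1` are all assumptions made for the sketch. The only point it shows is that lines evicted from the primary cache are parked in a small side cache, so a conflict miss can be served without going to the next level.

```python
from collections import OrderedDict


class VictimCachedL1:
    """Toy model: direct-mapped primary cache plus a small LRU victim cache."""

    def __init__(self, l1_sets, victim_entries):
        self.l1_sets = l1_sets
        self.victim_entries = victim_entries
        self.l1 = {}                 # set index -> resident line address
        self.victim = OrderedDict()  # line address -> None, oldest first

    def _evict_to_victim(self, line):
        """Park a line evicted from the primary cache in the victim cache."""
        self.victim[line] = None
        self.victim.move_to_end(line)
        if len(self.victim) > self.victim_entries:
            self.victim.popitem(last=False)  # drop the oldest parked line

    def access(self, line):
        """Return 'l1', 'victim', or 'miss' for a line address."""
        index = line % self.l1_sets
        resident = self.l1.get(index)
        if resident == line:
            return "l1"
        if line in self.victim:
            del self.victim[line]    # the line moves back into the primary cache
            result = "victim"
        else:
            result = "miss"
        if resident is not None:
            self._evict_to_victim(resident)
        self.l1[index] = line
        return result


cache = VictimCachedL1(l1_sets=4, victim_entries=2)
# Lines 0 and 4 map to the same set and keep evicting each other.
results = [cache.access(line) for line in [0, 4, 0, 4, 4]]
print(results)
```

In this trace the two conflicting lines miss once each, after which the victim cache catches the ping-pong evictions.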

MODIFICATION IN I/O
• P2P HyperTransport (HT) updated from 1 to 2
• Memory access speed raised from DDRIII 800 to DDRIII 1200

[1]

CLOCKING SCHEME
• Globally asynchronous, locally synchronous (GALS)
• External rclk: 33 MHz
• Global clock gclk: 200 MHz
• DFS (Dynamic Frequency Scaling): 1.5 GHz
• Node clock: 1 GHz
• DCDL: digitally controlled delay lines

[5]

MEMORY INTERFACE
• Two 64b interfaces, 153.6 Gb/s
• On-die termination (ODT) impedance (60-120 Ω range with 5 Ω step)
• Dynamic output slew-rate control
• Dynamic off-chip driver (OCD) impedance (34-40 Ω range with 1 Ω step)

[1]
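The 153.6 Gb/s figure is consistent with two 64-bit interfaces each running at 1200 MT/s, the memory speed quoted on the I/O slide. The sketch below checks that arithmetic and enumerates the impedance settings implied by the quoted ranges and steps. Treating 153.6 Gb/s as the aggregate of both interfaces, and assuming both range endpoints are selectable, are readings of the slide and not stated facts; the function names are illustrative.

```python
CHANNELS = 2
BUS_WIDTH_BITS = 64
TRANSFER_RATE_MTPS = 1200  # assumption: the "1200" memory speed in MT/s


def aggregate_bandwidth_gbps(channels, width_bits, mtps):
    """Peak bandwidth in Gb/s across all channels."""
    return channels * width_bits * mtps / 1000.0


def impedance_settings(low_ohm, high_ohm, step_ohm):
    """All selectable impedances, assuming both endpoints are included."""
    return list(range(low_ohm, high_ohm + 1, step_ohm))


total_gbps = aggregate_bandwidth_gbps(CHANNELS, BUS_WIDTH_BITS, TRANSFER_RATE_MTPS)
odt = impedance_settings(60, 120, 5)
ocd = impedance_settings(34, 40, 1)

print(f"Aggregate bandwidth: {total_gbps:.1f} Gb/s ({total_gbps / 8:.1f} GB/s)")
print(f"ODT settings: {len(odt)}, OCD settings: {len(ocd)}")
```

With these assumptions the product is 153.6 Gb/s (19.2 GB/s), and the ranges give 13 ODT and 7 OCD settings.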

HYPERTRANSPORT
• Bandwidth of 22.4 GB/s
• Up to 2.8 Gb/pin/s
• BER of less than 10^-15 (CDR)
• Simple direct sampling in low-power mode
• All-digital DLL-based CDR for high speed
[1]
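The three HyperTransport figures can be related with a short calculation. The sketch below derives how many data pins at 2.8 Gb/s are needed to reach 22.4 GB/s, and how rarely a single pin would see a bit error at the quoted BER bound. How those pins are split across links and directions is not given on the slide, so the result is only an implied total; the function names are illustrative.

```python
HT_BANDWIDTH_GBYTES = 22.4  # GB/s, from the slide
PIN_RATE_GBITS = 2.8        # Gb/s per pin, from the slide
BER_BOUND = 1e-15           # upper bound on bit error rate, from the slide


def implied_data_pins(bandwidth_gbytes, pin_rate_gbits):
    """Number of data pins needed to carry the quoted bandwidth."""
    return bandwidth_gbytes * 8 / pin_rate_gbits


def seconds_between_errors(pin_rate_gbits, ber):
    """Mean time between bit errors on one pin running at full rate."""
    return 1.0 / (pin_rate_gbits * 1e9 * ber)


pins = implied_data_pins(HT_BANDWIDTH_GBYTES, PIN_RATE_GBITS)
days = seconds_between_errors(PIN_RATE_GBITS, BER_BOUND) / 86400.0

print(f"Implied data pins: {pins:.0f}")
print(f"Mean time between errors per pin: at least {days:.1f} days")
```

The quoted bandwidth corresponds to 64 data pins in total, and a BER below 10^-15 means a single pin averages more than about four days between bit errors.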

HPC WITH GODSON-3A

Linpack test results

[4]

REFERENCES
[1] Godson-3B1500: A 32nm 1.35GHz 40W 172.8GFLOPS 8-Core Processor, ISSCC 2013 / Session 3 / Processors / 3.5
[2] Loongson on Wikipedia
[3] Godson-2H: A Complex Low Power SOC in 65nm CMOS, IEEE 2012
[4] Design and Implementation of BIOS for Godson-3A Interconnections, IEEE 2011
[5] Godson-3B: A 1GHz 40W 8-Core 128GFLOPS Processor in 65nm CMOS, ISSCC 2011 / Session 4 / Enterprise Processors & Components / 4.4
[6] Godson-3: A Scalable Multicore RISC Processor with x86 Emulation, IEEE 2009
[7] Microarchitecture and Performance Analysis of Godson-2 SMT Processor, IEEE 2006