of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2...

21
4/27/2019 1 Digital Logic and Design (Course Code: EE222) Lecture 3334: Memory contd.. Programmable Logic Devices Indian Institute of Technology Jodhpur, Year 20182019 Programmable Logic Devices Course Instructor: Shree Prakash Tiwari Email: [email protected] Office: 210, Phone: 02912441356 Webpage: http://home.iitj.ac.in/~sptiwari/ Course related documents will be uploaded on http://home.iitj.ac.in/~sptiwari/DLD/ 1 Note: The information provided in the slides are taken form text books Digital Electronics (including Mano & Ciletti), and various other resources from internet, for teaching/academic use only Programmable Logic Overview ° Programmable logic offers designers opportunity to customize chips ° Programmable logic devices have a fixed logic structure structure ° Programmable array logic contain AND-OR circuits First introduced in early 1980’s ° Field programmable gate arrays (FPGAs) contain small blocks that implement truth tables First introduced in 1985 (Xilinx Corporation) ° Software used to convert user designs to programming information

Transcript of of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2...

Page 1: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

1

Digital Logic and Design (Course Code: EE222)

Lecture 33‐34: Memory contd..Programmable Logic Devices

Indian Institute of Technology Jodhpur, Year 2018‐2019

Programmable Logic Devices

Course Instructor: Shree Prakash TiwariEmail: [email protected]

Office: 210, Phone: 0291‐244‐1356

Webpage: http://home.iitj.ac.in/~sptiwari/

Course related documents will be uploaded on  http://home.iitj.ac.in/~sptiwari/DLD/

1

Note: The information provided in the slides are taken form text books Digital Electronics (including Mano & Ciletti), and various other resources from internet, for teaching/academic use only

Programmable Logic Overview

° Programmable logic offers designers opportunity to customize chips

° Programmable logic devices have a fixed logic structurestructure

° Programmable array logic contain AND-OR circuits• First introduced in early 1980’s

° Field programmable gate arrays (FPGAs) contain small blocks that implement truth tables

• First introduced in 1985 (Xilinx Corporation)

° Software used to convert user designs to programming information

Page 2: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

2

Design Implementation

• Chip creation is a long and difficult process

• Millions of dollars required to create custom silicon

- Simulation, synthesis, fabrication (lots of jobs for engineers)

Programmable Logic Design

° “Generic” chip created and then customized by designer

° Programming information usedLik ROM• Like a ROM

° Analogy – sign making• Custom sign – more expensive, customized by manufacturer, difficult

to change

• Sign built by consumer from individual letters – less expensive, not quite as nice, easier to change (remix letters)

Page 3: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

3

Programmable Logic Devices

Programmable Logic Arrays (PLA)g g y ( )

Programmable Array Logic (PAL)

Simple Programmable Logic Device (SPLD)

Complex Programmable Logic Device (CPLD)

Field Programmable Gate Array (FPGA)

PLD Summary

Page 4: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

4

PLA Example

Programmable Array Logic

• Implements sum-of-products expressions

• Four external inputs (and complements)

x(and complements)

• Feedback path from output F1

• Product term connections made via switches

x

Page 5: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

5

Programmable Array Logic

• Consider implementing the following expression

- I1 I2 I3 + I2‘ I3‘ I4 + I1 I4 = F1

xx x x

x xxx

1 2 3 2 3 4 1 4 1

• Note that only functions of up to three product terms can be implemented

- Larger functions need to be chained together via the feedback pathp

Reconfigurable Hardware

Logic Element

ABC

OutAND

• Each logic element operates on four one-bit inputs.

• Output is one data bit.

CD

A . B . C . D = out

• Can perform any Boolean function of four inputs

Page 6: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

6

Field-Programmable Gate Array

LE LE LE LE

Logic Element Tracks

LE LE

LE LE

LE LE

LE LE

• Each logic element outputs one data bit.

• Interconnect programmable between elements.

• Interconnect tracks grouped into channels.

FPGA Architecture Issues

LogicElement

• Need to explore architectural issues.

• How much functionality should go in a logic element?

• How many routing tracks per channel?

• Switch “population”?

Page 7: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

7

Translating a Design to an FPGA

C program

.

.

Circuit

AB

+ C

Array

• CAD to translate circuit from text description to physical implementation well understood.

• CAD to translate from C program to circuit not well understood.

• Very difficult for application designers to successfully write high-

C = A+B.

B

y pp g y gperformance applications

Need for design automation!

Circuit Compilation

1. Technology Mapping

LUT

2. PlacementLUT

? Assign a logical LUT to a physical location.

3. Routing Select wire segmentsAnd switches forInterconnection.

Page 8: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

8

Two Bit Adder

FA

A B

Co Ci

Made of Full Adders

A+B = D

o i

S

Logic synthesis tool reduces circuit toSOP form

S = ABCi + A’B’Ci + AB’Ci’ + A’BCi’

Co = ABCi + A’BCi + AB’Ci + ABCi’

LUT CoCiBA

LUT SCiBA

Dynamic Reconfiguration

L L

L L

• What if I want to exchange part of the design in the device with another piece?

• Need to create architectures and software to incrementally change designs.

• Effectively a “configuration cache”

Page 9: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

9

Field Programmable Arrays Capabilities

° Dominant digital design implementation

° Ability to re-configure FPGA to implement any digital logic functionlogic function

• Partial re-configuration allows a portion of the FPGA to be continuously running while another portion is being re-configured

° FPGAs also contain analog circuitry features including a programmable slew rate and drive strength, differential comparators on I/O designed to be connected to differential signaling channels.

° Mixed-signal FPGAs contains ADCs and DACs with analog signal conditional blocks allowing them to operate as a system-on-chip (SoC)

FPGA Architectures

° Early FPGAs• N x N array of unit cells (CLB + routing)

- Special routing along center axis

° Next Generation FPGAs• M x N unit cells

• Small block RAMs around edges

° More recent FPGAs• Added block RAM arrays

• Added multiplier cores

• Adders processor cores

configurable logic block (CLB)

Page 10: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

10

FPGA Architecture Trends

° Memories• Single & Dual-port RAMS

• FIFO (first-in first-out)

ECC ( ti d )• ECC (error correcting codes)

° Digital Signal Processors• Multipliers

• Accumulators

• Arithmetic Logic Units (ALUs)

° Embedded Processors• Hardcore (dedicated processors)

- Dedicated program and data memories

- Programmable RAM in FPGA can be used in conjunction with the processor to provide program and data memories

• Soft core (synthesized from a HDL)

Basic FPGA Architecture

•More recent FPGA architectures have small block RAM arrays (usually placed in center column), multipliers, processor cores, DSP cores w/ multipliers, and I/O cells along columns for ball grid arrays (BGAs)

Page 11: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

11

FPGA Operation

User writes configuration memory which defines the function of the system. Thisdefines the function of the system. This includes: the connectivity between the CLBs and the I/O cells, the logic to be implemented onto the CLBs, and the I/O blocks.

By changing the data in the configuration memory, the function of the system h ll Thi h i d tchanges as well. This change in data can

be implemented at anytime during FPGA operation (run-time configuration).

Configurable Logic Blocks (CLBs) Architecture

° CLBs consist of:• Look-up Tables (LUT) which implement the entries of a logic

functions truth table- Some FPGAs can use LUTs to implement small Random

Access Memory (RAM)

• Carry and Control Logic- Implements fast arithmetic operations (adders/ subtractors)

- Can be also configured for additional operations (Built-in-Self Test iterative-OR chain)

• Memory Elements- Configurable Flip Flops (FFs)/ Latches( Programmable clock

edges, set/reset, and clock enable)

- These memory elements usually can be configured as shift-y y gregisters

Page 12: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

12

Configurable Logic Blocks

A CLB can contain several slices, which

k i l CLBmake up a single CLB. Xilinx Virtex-5 FPGAs (right) have two slices: SLICEL (logic) and SLICEM (memory).

In addition to the basic CLB architecture, the Virtex-5 contains wide-Virtex 5 contains widefunction MUXs which can implement:- 4:1 MUX using 1 LUT- 8:1 MUX using 2 LUTs- 16:1 MUX using 4 LUTs

Look-up Tables (2:1 MUX Example)

° Configuration memory holds output of truth table entries

° Internal signals connect to control signals of MUXs° Internal signals connect to control signals of MUXs to select a values of the truth tables for any given input signals

Page 13: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

13

LUT Based Ram

° Normal LUT mode performs read operation

° Address decoders with WE generates clock signals to latches for write operation

° Smaller RAMs canSmaller RAMs can be combined to create larger RAMs (up to 64-bit in Virtex-5)

FPGA Programmable Interconnection Network° Horizontal and vertical mesh of wire segments interconnected by programmable

switches called programmable interconnect points (PIPs). These PIPs are implemented using a transmission gate controlled by a memory bits from the configuration memory.

° Consists of global routing connecting PLBs to I/O buffers, non-adjacent PLBs, and other embedded components. Local routing connects PLBs to other adjacent PLBs and PLBs to global routing (done through a switch matrix)

° Several types of PIPs are used• Cross-point = connects vertical or horizontal wire segments allowing turns

• Breakpoint = connects or isolates 2 wire segments

• Decoded MUX = group of 2^n cross-points connected to a single output configure by n configuration bits

• Non-decoded MUX = n wire segments each with a configuration bit (n segments)

• Compound cross-point = 6 Break-point PIPS (can isolate two isolated signal nets)

Page 14: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

14

Progammable Input/Output Cells

° Bi-directional Buffers• Programmable for inputs or outputs

• Tri-state controls bi-directional operation

• Pull-up/down resistors

• FFs/ Latches are used to improve timing issues• FFs/ Latches are used to improve timing issues

- Set-up and hold times

- Clock-to-out delay

° Routing Resources• Connections to core of array

° Programmable I/O voltage and current levels

Boundary Scan Access

FPGA Configuration Interfaces

° Master (Serial or Parallel)• FPGA retrieves configuration from ROM at initial power-up

° Slave (Serial or Parallel)• FPGA configured by an external source (i e microprocessor/• FPGA configured by an external source (i.e microprocessor/

other FPGA)

• Used for dynamic partial re-configuration

° Boundary Scan • 4-wire IEEE standard serial interface used for testing

• Write and read access to configuration memory

• Interfaces to FPGA core internal routing network

Page 15: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

15

Boundary Scan Configuration

Multi-FPGA Emulation Framework to support NoC design and verification (UNLV NSIL)

Developed to test interconnect between chips on PCBchips on PCB

Daisy Chain Configuration

Test Access Point (TAP) controller composed of 16 state FSM

FPGA Configuration Techniques

° Full configuration and readback

• Simple configuration interface

- Automatic internal calculation of frame address

• Larger FPGAs have a longer download time

° Compressed configuration

• Requires multiple frame write capability

- Identical frames of configuration data are written to multiple frame addresses

• Extension of partial re-configuration interface capabilities

- Frame address is much smaller than frame of configuration data

• Reduces download time for initial configuration depending on regularity ofReduces download time for initial configuration depending on regularity of system function and the array percent that is utilized

° Partial re-configuration and readback

• Only change portions of configuration memory with respect to reference design

- Reduces download time for re-configuration

Page 16: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

16

Xilinx Virtex-5 FPGAs

Virtex-5 FPGA Platforms

•Over 320,000 PLBs on the largest Virtex-5

•ExpressFabric interconnect sturcture and 12 levels of metal interconnect allowing implementation of complex logic functions

Five Virtex-5 Platforms1. LX- general logic applications2 LXT- logic with advanced

allowing connections to neighboring PLBs in few hops than Virtex-4

•Each PLB contains 8 LUTs, 8 configurable memory elements (can be configured as RAM/ ROM/ shift register)

•Enhanced DSP functions on 25 x 18-bit multipliers (ability to be cascaded)

2. LXT- logic with advanced serial connectivity

3. SXT-signal processing applications with advanced serial connectivity

4. TXT- high performance systems with double density advanced serial connectivity

5. FXT- high performance multipliers (ability to be cascaded)

•Clock managments contain one PLLC and two managers which can drive global clock buffers and filter jitter (cascaded)

g pembedded systems with advanced serial connectivity

Page 17: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

17

Virtex-5 CLB

A single CLB in Virtex-5 consists of two slices: SLICEL (logic) and SLICEM (memory) EachSLICEL (logic) and SLICEM (memory). Each CLB is connected to a switch matrix which can access to a general routing (global) matrix.

Every slice contains four LUTS, wide function MUXs, carry logic, and configurable memory elements. SLICEM support storing data using distributed RAM and data shifting with 32-bit shift registers

FPGA Design Comparison Virtex-5, Virtex-6, and spartan 6

Virtex-6 CLB have the same setup as Virtex-5 (SLICEL & SLICEM)

Virtex-6 devices add four additional storage elements which can only be configured as edge-triggered D-FFs. The D inputs are driven by the output of the LUTs or bypass slice inputs yp pAX-DX

Page 18: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

18

FPGA Design Comparison Virtex-5, Virtex-6, and spartan 6

Spartan-6 CLB columns are separated into two columns: 1 column for a new SLICEX and 1 column for alternating SLICEL and SLICEM. SLICEX is a basic CLB without any carry logic added

Back to Virtex-5 CLB LUT

° Up to 207, 360 LUTs (6-input) with greater than 13 million configuration bits.

° Can be configured as dual-output 5-input LUTs. In single 6 input LUT O6 is the primary outputsingle 6-input LUT, O6 is the primary output.

Page 19: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

19

Virtex-5 Programmable I/O

The I/O cells in Virtex-5 have output logic blocks (OLOGIC) , input logic blocks (ILOGIC), I/O delays blocks, and a bidirectional I/O buffer.y ,

OLOGIC implements registers to improve system clock-to-output timing and supports single data-rate (SDR) and double data-rate (DDR) reception of data. It can also perform parallel-to-serial conversion of output data (2 & 6 bits) in Serial/De-serializer (SerDes) mode.

ILOGIC i l i iILOGIC implements registers to improve set-up and hold times and support SDR and DDR transmission of data. It can perform serial-to-parallel conversion of input data(2 & 6 bits) when in SerDes mode.Two I/O cells are grouped to form a

single I/O tile. In master/slave mode, two I/O cells in the same I/O tile are connected via dedicated shift routing to support larger data widths.

FPGAs see diminishing benefits with scaling

° 90% of FPGA logic area is programmable interconnect

° Performance and power penalty are direct result of the area (70% Virtex 2)the area (70% Virtex-2)

° Interconnect needs to increase faster than number of gates to keep up (Rents rule)

10%Interconnect

60%16%

14%Logic

Clocking

IOB

Dynamic Power in Virtex-2 (Shang FPGA’02)

Page 20: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

20

3D Integrated Circuits•More functionality in a smaller space extends Moore’s Law•More transistors in a package larger designs•Shorter Interconnects less RC delays better chip performance•Power Decrease shorter wires reduce power consumption by producing less capacitance (also less inductance)•Bandwith large number of vertial vias between layers allow construction ofBandwith large number of vertial vias between layers allow construction of wide bandwidth buses between functional blocks in different layers

3D Integrate Circuit

Metal layers

Device layer 2

Metal layers

Si Substrate

Device layer 1

Metal layers

Page 21: of Technology Digital Logic and Designhome.iitj.ac.in/~sptiwari/DLD/Lecture33_34_DLD.pdf4/27/2019 2 Design Implementation • Chip creation is a long and difficult process • Millions

4/27/2019

21

Summary

° Programmable logic allows for designers to easily create custom designs

° Programmable array logic contains AND-OR structures to implement SOP equationsto implement SOP equations

° FPGAs contain small memories and numerous wires for routing

° Designers create designs in Verilog• Design translated to the chip via software