Seminario utovrm
Uploaded by dario-pennisi
Category: Engineering
Agenda
Part 1
Microprocessor and silicon technology evolution
Memories
Bus architectures
System on Chip
Part 2
GPU
FPGA
Architectural design
Collaboration tools
Open Source
15/04/2023
Introduction
Electronic System Design
Architectural design
• Break down the design into subsystems
• Define subsystem functionality
• Define interfaces (busses, protocols, APIs, etc.)
Methodologies
• Implement collaboration tools for large, distributed teams
• Define version control strategies
• Define verification methodologies
Experience is key
NEVER reinvent the wheel
History
In the beginning…
Discrete components (resistors, transistors)
Multiple transistors to form a single gate
Even a simple counter required multiple boards
History
The integrated circuit (1950-1960)
Conceived in the early 1950s, first working device in 1958
Technology available around 1960
Multiple logic gates in a single device (chip)
Notes
• Great inventions are visionary
• An invention is not necessarily feasible immediately
• Designers shall foresee technological breakthroughs
History
Intel 8086 (1978)
29K transistors, 3 µm
10 MHz
First “true” 16 bit microprocessor
Required several external peripherals
• Interrupt controller
• DMA
• Timer
Backwards compatible with 8080 (1974)
• First “usable” microprocessor, 4500 transistors, 10 µm
• Required +12V and -5V supplies
• 2 MHz
Notes
• First microprocessors are 40 years old
• The latest x86 Core i7 is backwards compatible with something from 36 years ago!
History
Programmable Logic Array (1977)
Fuse based
Can implement any combinational logic
Programmable at production time
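A PLA realizes combinational logic as a sum of products: a programmable AND plane builds product terms and a programmable OR plane combines them into outputs. The sketch below models that idea in Python for a full adder. The `AND_PLANE` and `OR_PLANE` tables and the `pla_eval` helper are hypothetical names chosen for illustration; they stand in for the fuse map and do not describe any specific device.

```python
from itertools import product

# AND plane: each product term maps an input index to the value it requires
# (1 = true input, 0 = complemented input). Inputs not listed are don't-care.
# Input order is assumed to be (a, b, cin).
AND_PLANE = [
    {0: 1, 1: 1},        # a & b
    {0: 1, 2: 1},        # a & cin
    {1: 1, 2: 1},        # b & cin
    {0: 1, 1: 0, 2: 0},  # a & ~b & ~cin
    {0: 0, 1: 1, 2: 0},  # ~a & b & ~cin
    {0: 0, 1: 0, 2: 1},  # ~a & ~b & cin
    {0: 1, 1: 1, 2: 1},  # a & b & cin
]

# OR plane: each output is the OR of a set of product terms.
OR_PLANE = {"carry": [0, 1, 2], "sum": [3, 4, 5, 6]}


def pla_eval(inputs):
    """Evaluate all outputs for one input vector."""
    terms = [all(inputs[i] == v for i, v in term.items()) for term in AND_PLANE]
    return {name: int(any(terms[t] for t in idxs)) for name, idxs in OR_PLANE.items()}


# Exhaustively compare the PLA model against ordinary addition.
for a, b, cin in product((0, 1), repeat=3):
    out = pla_eval((a, b, cin))
    assert out["sum"] == (a + b + cin) % 2
    assert out["carry"] == (a + b + cin) // 2
```

Changing only the two tables reprograms the function, which is the point of a fuse-programmable array.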
History
PLD (1983)
More complex than PLA, reprogrammable
Introduces macrocell concept
• Small PLA with a flip-flop
• External interconnect
History
LCA (1985)
Programmable sea of gates, 1 µm
RAM based with external configuration memory
Up to 7600 logic gates (484 CLBs)
Notes
• Moving from OTP to reprogrammable made a big difference
• Programmable logic is key when ASSPs are not available
• Same design flow as ASICs
History
Dynamic RAM
Invented in 1966, first useful device in 1973
Drastically reduces transistor count compared to SRAM
Requires refresh
Multiplexed addressing
• Reduces access time
• Increases latency
Banking
• Read/write multiple rows at the same time
History
DRAMs or HDDs can have very long access times
Random access to high latency devices kills performance
Access to adjacent data requires very low latency
Cache memory
Whenever random data is requested, the cache stores adjacent locations in «cache lines»
Subsequent accesses to data in the cache have low latency
Multiple levels of cache improve performance
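The effect of cache lines can be illustrated with a small simulation. The sketch below is a minimal direct-mapped cache model in Python. The geometry (256 lines of 64 bytes) and the class and function names are assumptions made for the example, not figures from the slides. It compares a sequential byte scan with scattered accesses over a much larger address range.

```python
import random


class DirectMappedCache:
    """Minimal direct-mapped cache model that only tracks hits and misses."""

    def __init__(self, num_lines=256, line_size=64):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # one stored tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        line_addr = address // self.line_size   # which memory line is touched
        index = line_addr % self.num_lines      # where it lives in the cache
        tag = line_addr // self.num_lines       # identifies the line at that index
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag                  # miss: fetch the whole line
        self.misses += 1
        return False

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def run(addresses):
    """Feed an address sequence to a fresh cache and return its hit rate."""
    cache = DirectMappedCache()
    for address in addresses:
        cache.access(address)
    return cache.hit_rate()


# Sequential scan: every miss brings in a line that serves the next 63 bytes.
sequential = run(range(0, 64 * 1024))

# Scattered accesses over 64 MiB: lines are evicted before they are reused.
rng = random.Random(0)
scattered = run(rng.randrange(0, 64 * 1024 * 1024) for _ in range(64 * 1024))

print(f"sequential hit rate: {sequential:.3f}")
print(f"scattered hit rate:  {scattered:.3f}")
```

With 64-byte lines the sequential scan misses once per line, so its hit rate is 63/64, while the scattered pattern almost never finds its data in the cache.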
Notes
• Thinking outside the box allows revolutionary solutions
• Tradeoffs are acceptable when benefits prevail
• Limitations of new technologies stimulate more innovation
History
Intel 80486 (1989)
1.2M transistors, 1 µm
50 MHz / 40 MIPS
First to embed cache (8 KB)
• Reduces DRAM latency penalty
First to embed FPU
32 bit data bus
Notes
• 10x technology shrink in 15 years (10 → 1 µm)
• Embedding of widely used external coprocessors
History
Pentium (1993)
3.3M transistors, 800 nm
66 MHz, no direct connection to memory
In 1996 introduced MMX
Distributed architectures
Microprocessor
Memory interface bridge
Peripheral and bus bridge
External Super I/O
Notes
• Processor doesn’t interface directly with memory anymore
• The northbridge routes processor accesses to memory and high speed busses, abstracting them
• Peripherals get integrated into the southbridge
History
Parallel bus topology (ISA, PCI, AGP)
Separate or multiplexed address/data
Limited by signaling technology
• Few MHz with TTL
• Up to 150 MHz with LVCMOS
Dual and quad data rate to further improve bandwidth (mainly on memories)
• Some use of differential signaling
History
Peripheral Component Interconnect
Configuration space
• Automatic card detection and configuration
• Extended card information
• Standardized register set
Dynamic device address mapping
• No more conflicts among multiple cards on the same bus
Introduces bursting
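Dynamic address mapping means the host, not a jumper on the card, decides where each device appears in the address space. The sketch below is a deliberately simplified Python model of that idea: each card only declares how much address space it needs, and the host hands out naturally aligned, non-overlapping base addresses. The function name, the window start address and the card sizes are hypothetical, and the real PCI sizing handshake through the base address registers is not modeled.

```python
def assign_bases(requests, window_start=0x8000_0000):
    """Assign a base address to every device.

    requests maps a device name to its region size in bytes (power of two).
    Returns a dict mapping each device name to its assigned base address.
    """
    bases = {}
    next_free = window_start
    # Placing the largest regions first keeps alignment padding small.
    for name, size in sorted(requests.items(), key=lambda kv: -kv[1]):
        if size <= 0 or size & (size - 1):
            raise ValueError(f"{name}: size must be a power of two")
        base = (next_free + size - 1) & ~(size - 1)  # round up to natural alignment
        bases[name] = base
        next_free = base + size
    return bases


cards = {"net": 0x1000, "video": 0x100_0000, "audio": 0x100}
bases = assign_bases(cards)
for name, base in bases.items():
    print(f"{name:6s} base=0x{base:08X} size=0x{cards[name]:X}")
```

Because every range is assigned by a single allocator, two cards can never claim the same addresses, which is what removes the conflicts seen on earlier busses.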
History
Serial busses (PCIe, USB, HDMI, etc.)
Smaller number of traces
Very low voltage differential signaling
Clocked or self clocking
Multi Gbit/s per lane
PCIe
• 1.0 – 2 Gbit/s per lane
• 2.0 – 4 Gbit/s per lane
• 3.0 – 8 Gbit/s per lane
USB
• 1.0 – 12 Mbit/s
• 2.0 – 480 Mbit/s
• 3.0 – 5 Gbit/s
History
PCI Express
Introduces layered, packetized bus
Star connection rather than one to many
Allows tree configuration via PCI-PCI bridges
Scalable bandwidth
• Pin compatible connectors from 1 to 16 lanes
• Increasing bit rate at each generation
Overhead
• Protocol & flow control
• Encoding
– 20% on Gen1&2 (8b/10b)
– 1.54% on Gen3 (128b/130b)
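The encoding overhead figures follow directly from the ratio of payload bits to transmitted bits. The sketch below works this out in Python. The raw line rates (2.5, 5 and 8 GT/s) are the nominal signaling rates of the three generations and are an addition to the slide content; the function names are chosen for the example. Protocol and flow control overhead is not included.

```python
# generation: (raw line rate in GT/s, payload bits, transmitted bits per code group)
GENERATIONS = {
    "Gen1": (2.5, 8, 10),
    "Gen2": (5.0, 8, 10),
    "Gen3": (8.0, 128, 130),
}


def effective_gbit_per_lane(raw_gt_s, payload_bits, coded_bits):
    """Usable bit rate per lane after line encoding, in Gbit/s."""
    return raw_gt_s * payload_bits / coded_bits


def encoding_overhead(payload_bits, coded_bits):
    """Fraction of transmitted bits spent on line encoding."""
    return 1 - payload_bits / coded_bits


for gen, (raw, payload, coded) in GENERATIONS.items():
    eff = effective_gbit_per_lane(raw, payload, coded)
    ovh = encoding_overhead(payload, coded)
    print(f"{gen}: {eff:.2f} Gbit/s per lane, "
          f"x16 link {eff * 16 / 8:.1f} GB/s, encoding overhead {ovh:.2%}")
```

Gen1 and Gen2 come out at exactly 2 and 4 Gbit/s per lane. Gen3 lands slightly below 8 Gbit/s, since 8 GT/s is the raw rate before the 128b/130b overhead is removed.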
Notes
• Communication between subsystems is key
• Bandwidth can be increased without brute force
• High speed, low voltage serial is faster and more energy efficient than parallel LVCMOS
System on Chip
Microprocessor plus multiple peripherals and memory in a single chip
IP blocks from multiple vendors are integrated in a single device
Typical smartphone SoC
Processor from ARM
GPU from Qualcomm (Adreno)
Peripherals from Synopsys
etc.
Interconnect
Interconnection between IP blocks
Ensure interoperability
Maximize performance
Address system complexity
Multiple masters
Locked transfers
Cache coherency
Testability
Multicore debugging
System trace
Performance counters
Interconnect
AXI
Interconnects processors and high performance peripherals
Multilayer matrix configuration
AXI-Stream
Streaming interface for packetized data flow
Multiple data widths within same interconnect
Backpressure support
APB
Interconnects low speed peripherals
ATB
Adds tracing capability to any peripheral
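Backpressure on AXI-Stream rests on a two-signal handshake: a data beat is transferred only in a cycle where the source asserts TVALID and the sink asserts TREADY. The sketch below is a cycle-based Python model of that rule, not RTL. The `simulate` function and the repeating `ready_pattern` are hypothetical constructs for the example, and the source is assumed to always have data available.

```python
def simulate(data, ready_pattern):
    """Model a stream link where a beat moves only when valid and ready are both high.

    data: beats the source wants to send, in order.
    ready_pattern: repeating per-cycle TREADY values driven by the sink.
    Returns (received beats, per-cycle trace of "xfer"/"stall").
    """
    if not any(ready_pattern):
        raise ValueError("sink is never ready, the transfer would never finish")
    received = []
    trace = []
    pending = list(data)
    cycle = 0
    while pending:
        tvalid = True  # assumption: the source always has data while beats are pending
        tready = ready_pattern[cycle % len(ready_pattern)]
        if tvalid and tready:
            received.append(pending.pop(0))
            trace.append("xfer")
        else:
            trace.append("stall")  # backpressure: the source holds its data
        cycle += 1
    return received, trace


received, trace = simulate([10, 11, 12, 13], ready_pattern=[True, False, False])
print(received)
print(trace)
```

No data is lost or reordered when the sink stalls; the transfer simply takes more cycles, which is what lets blocks with different throughput share one interconnect.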
Notes
• Single chip integrates all peripherals except memories
• IP blocks from different vendors
• Interconnect standardization (AMBA)
• Test and debugging challenges
Notes
• big.LITTLE architecture
• Heterogeneous processors for different tasks
• Codecs implemented in software
• Application specific interfaces
Bottlenecks
Memory latency
DDR clock speeds reach the GHz range
Column access time in the order of 5 ns
Row access time still in the order of 50 ns
Can be worked around with multiple levels of cache memory
Bandwidth
Clock frequency is limited by technology
Large busses are expensive
Can be worked around with distributed memory
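How much multiple cache levels hide memory latency can be estimated with the usual average access time calculation. The sketch below does this in Python. The 50 ns memory latency is the row access figure from the slide; the cache hit times and hit rates are purely illustrative assumptions, and `average_access_time` is a name chosen for the example.

```python
def average_access_time(levels, memory_ns):
    """Average access time in ns for a cache hierarchy in front of main memory.

    levels: list of (hit_time_ns, hit_rate) tuples, ordered from L1 outwards.
    memory_ns: latency of main memory, paid by accesses that miss every level.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for hit_time, hit_rate in levels:
        total += reach * hit_time     # every access reaching a level pays its lookup
        reach *= 1.0 - hit_rate       # only the misses continue outwards
    return total + reach * memory_ns


no_cache = average_access_time([], memory_ns=50)
two_levels = average_access_time([(1, 0.95), (5, 0.80)], memory_ns=50)
print(f"no cache:   {no_cache:.2f} ns")
print(f"two levels: {two_levels:.2f} ns")
```

With these assumed figures only 1% of accesses reach memory, so the average drops from 50 ns to under 2 ns even though the memory itself is no faster.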
GPUs
Graphics Processing Unit
Started as dedicated vertex processors
Evolved towards SW programmable shaders
Now used for massively parallel computation
• OpenCL
• CUDA
Massively parallel
Hundreds of parallel processors
Multiple chips can be teamed for increased performance
Today
GPU architecture
Each core has high speed memory
Cores are grouped in clusters
Each cluster has local memory
Each cluster can access device memory
Host memory can be transferred to device memory via DMA
Different levels of memory latency
Each core in a cluster executes the same code
Today
GK110 Kepler (Nvidia)
7G transistors (28 nm)
1.5 MB on-chip L2 cache
15 SMX units (64 KB RAM each)
• 192 single precision cores
• 64 double precision cores
• 32 special function units + 32 load/store units
External GDDR5
6 x 64 bit memory controllers
Up to 6 GHz clock speed
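The figures on this slide can be combined into two headline numbers: the total core count and the peak memory bandwidth. The sketch below does the arithmetic in Python. It assumes the "6 GHz" figure is the effective transfer rate per data pin, which is an interpretation rather than something the slide states, and the constant names are chosen for the example.

```python
SMX_UNITS = 15
SP_CORES_PER_SMX = 192
DP_CORES_PER_SMX = 64
MEM_CONTROLLERS = 6
CONTROLLER_WIDTH_BITS = 64
TRANSFER_RATE_HZ = 6e9  # assumption: "6 GHz" read as effective transfers per second per pin

total_sp_cores = SMX_UNITS * SP_CORES_PER_SMX
total_dp_cores = SMX_UNITS * DP_CORES_PER_SMX
bus_width_bits = MEM_CONTROLLERS * CONTROLLER_WIDTH_BITS

# Peak bandwidth: bits per transfer times transfers per second, converted to GB/s.
peak_bandwidth_gb_s = bus_width_bits * TRANSFER_RATE_HZ / 8 / 1e9

print(f"single precision cores: {total_sp_cores}")
print(f"double precision cores: {total_dp_cores}")
print(f"memory bus width:       {bus_width_bits} bit")
print(f"peak memory bandwidth:  {peak_bandwidth_gb_s:.0f} GB/s")
```

Under that assumption the wide memory interface delivers 288 GB/s, far more than a single 64 bit channel could, which shows bandwidth being gained by architecture rather than clock speed alone.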
Notes
• Peripherals may be more complex than main processor
• Eliminating bottlenecks by architecture, not just brute force
• Transforming dedicated HW into SW programmable devices creates value
Today
SoC FPGA
High gate count FPGA + dual core Cortex-A9
• Lower integration than ASSPs
• Higher flexibility than ASSPs
Direct interconnection between FPGA and processor
• Possibility to accelerate software with FPGA IP
• Lower system cost by implementing less critical IPs in software
• On the fly reprogramming to repurpose hardware on demand
Trends
High density FPGA + SoC
14 nm Tri-Gate
Embedded 64 bit quad core Cortex-A53
1 GHz system speeds
56 Gbps transceivers
Support for 2.7 Tbps HMC
Support for 1.3 Tbps DDR4
Notes
• Integration of hard IP with programmable fabric
• Skyrocketing ASIC design costs favor FPGAs
• Large library of IP cores (including open source)
• FPGA to accelerate critical algorithms
Trends
Silicon feature size approaching the atomic scale
Quantum effects not negligible anymore
Light sources for lithography unavailable
New technologies to increase density
Trends
Stacked die, MCM
Multiple dies in a single package
Wired interconnect
3D interconnect
Interposers
Through silicon vias
Trends
Hybrid memory cube
Integrate memory + controller in a single package
Optimize memory performance (speed/power)
Connect multiple concurrent processors to a single device
Summary
• Chip density is increasing regardless of physical limits
• Systems are gradually being condensed to a single chip
• Chips requiring multiple technologies are manufactured with MCM or 3D processes
• System components from multiple vendors are integrated in single chips
• Software IS a system component
What we learnt
Summary
• Whenever Moore’s law hits a wall, breakthroughs keep it going
• Thinking outside the box is vital for innovation
• System design requires knowledge of leading edge technologies
• System optimization requires in-depth knowledge of IP block functionality at all levels
Theory
Electronic System Design
Methodologies
• Implement collaboration/knowledge management tools
• Define version control strategies
• Define verification methodologies
Architectural design
• Break down the design into subsystems
• Define subsystem functionality
• Define interfaces (busses, protocols, APIs, etc.)
Subsystem implementation
• Unit test design
• RTL coding
• Simulation
• Synthesis and timing closure (ASIC/FPGA)
Methodologies
System design requires organization
Even small groups can have communication issues
Even a single developer can miss information
Collaboration/knowledge management tools
Requirement and bug tracking
Revision control
Project planning
Build automation
Methodologies
Requirement tracking
Keep track of specifications
Clearly define dependencies
Help partition the work into smaller tasks
Bug tracking
Track issues and their solutions
• Solutions to old problems can shorten the time to solve new ones
• Knowledge of issues can prevent repeating mistakes
Clearly identify which changes have been adopted for a specific issue
• Regression testing
Methodologies – Version Control
Version control
Keep track of modifications and their reasons
• Always comment your commits
• Possibly reference the bug tracker
Allow multiple developers to work concurrently
Branch/tag
• Branching allows separate development environments for each developer
• Developers can commit broken code in branches
• Merge branches only when code is reliable
• Tag when code is stable or on milestones
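The two commit rules above (comment every commit, reference the bug tracker where possible) are easy to check automatically. The sketch below is one possible check in Python, of the kind that could be called from a commit hook. The issue key format (`SOC-42` style), the minimum summary length and the function name are all hypothetical choices for the example, not part of any particular tool.

```python
import re

# Hypothetical tracker key format: uppercase project code, dash, number (e.g. "SOC-42").
ISSUE_REF = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def check_commit_message(message, min_summary_len=10):
    """Return a list of problems found in a commit message (empty list = OK)."""
    problems = []
    lines = message.strip().splitlines()
    summary = lines[0].strip() if lines else ""
    if len(summary) < min_summary_len:
        problems.append("summary line is missing or too short")
    if not ISSUE_REF.search(message):
        problems.append("no bug tracker reference found")
    return problems


good = "Fix FIFO overflow on back-to-back writes\n\nRefs SOC-42"
bad = "wip"
print(check_commit_message(good))
print(check_commit_message(bad))
```

Whether a missing tracker reference should block a commit or only warn is a policy decision for the team; the slide itself lists it as optional.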
Methodologies - verification
Verification
Testing is crucial to ensure quality
Each IP shall include its unit test
Systems shall have test benches
Test cases shall be carefully planned
Coverage shall be known
Coding tests can take more time than coding the IP block
Plan testing before coding
Better specifications and clearer requirements
Quality is NOT a cost
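As an illustration of an IP shipping together with its unit test, the sketch below pairs a small behavioral model with planned test cases using Python's standard `unittest` module. The `Counter` block is a hypothetical example, not a design from this talk. In an RTL flow the same test plan would drive a simulator test bench instead.

```python
import unittest


class Counter:
    """Behavioral reference model of a wrapping up-counter (hypothetical IP)."""

    def __init__(self, width=4):
        self.width = width
        self.value = 0

    def clock(self, enable=True, reset=False):
        """Advance one clock cycle and return the new count."""
        if reset:
            self.value = 0
        elif enable:
            self.value = (self.value + 1) % (1 << self.width)
        return self.value


class CounterTest(unittest.TestCase):
    """One test per planned case: normal counting, wrap, reset, hold."""

    def test_counts_up(self):
        c = Counter()
        self.assertEqual([c.clock() for _ in range(3)], [1, 2, 3])

    def test_wraps_at_full_scale(self):
        c = Counter(width=2)
        self.assertEqual([c.clock() for _ in range(5)], [1, 2, 3, 0, 1])

    def test_reset_has_priority_over_enable(self):
        c = Counter()
        c.clock()
        self.assertEqual(c.clock(enable=True, reset=True), 0)

    def test_holds_value_when_disabled(self):
        c = Counter()
        c.clock()
        self.assertEqual(c.clock(enable=False), 1)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CounterTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
print("all tests passed:", result.wasSuccessful())
```

Writing the four test names down first is a small instance of planning testing before coding: each name is a requirement the block has to meet.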
Methodologies - Documentation
Always document your work!
Sharing knowledge improves teamwork
Documentation adds value to your work
You can’t remember everything
Issues
Synchronization with artifact versions
Completeness
Methodologies - Documentation
Doxygen
Document your code within the code
Automatic hierarchy documentation
Can be used with most programming languages
• Can be extended to any file with plugins
Can generate graphs with the dot tool
Multiple outputs (PDF, Word, HTML, etc.)
Automatic generation always in sync with code
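Documenting code within the code looks roughly as follows for a Python source file, using Doxygen's `##` comment blocks and `@param` / `@return` commands. The `Fifo` class is a hypothetical example block; the comment style is one of several that Doxygen accepts, so this is a sketch rather than a house standard.

```python
## @file fifo.py
#  @brief Behavioral model of a bounded FIFO (hypothetical example block).


## Bounded first-in first-out queue.
#
#  Models the storage behavior of a hardware FIFO so that test benches
#  can compare a design against it.
class Fifo:
    ## Create an empty FIFO.
    #  @param depth Maximum number of entries the FIFO can hold.
    def __init__(self, depth):
        self.depth = depth
        self._items = []

    ## Push one entry.
    #  @param item Value to store.
    #  @return True if the entry was accepted, False if the FIFO was full.
    def push(self, item):
        if len(self._items) >= self.depth:
            return False
        self._items.append(item)
        return True

    ## Pop the oldest entry.
    #  @return The oldest stored value, or None if the FIFO is empty.
    def pop(self):
        return self._items.pop(0) if self._items else None
```

Because the descriptions sit next to the code they describe, regenerating the documentation after each change keeps it in sync with the source.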
Architectural design
Before you start…
Search for existing solutions
• Literature
• Patents
• Open source
List requirements
• Define inputs and outputs
• Clearly understand critical issues
List use cases
• Define what resources are required for each scenario
Architectural design
Partition the design into independent units
Small
• Easy to maintain
• Simple to understand
Reusable
• Generalize a problem whenever possible
• Create a library of tested, robust building blocks
Documented
• Possibly use self documentation tools
• Test bench with use cases
Architectural design - Interfaces
Always try to use standard interfaces
Blocks can be reused more easily
Understand implications of an interface architecture
If a non standard interface is required…
Define a standard (and document it)
Check it against known use cases
Explore interface weak points and benefits
Open Source
Benefits
Collaborative design improves quality and stability through peer review
Huge code base for software and hardware IP
Drawbacks
Heterogeneous code styles and interfaces
No warranty on quality/functionality
Limited support from community
Open Source
Business models
Open source libraries and interfaces
• Company releases parts of code to community
• Community improves code functionality and reliability
• Establish trust with customers and partners
Open source applications and platforms
• Sell support and customization services
• Sell HW products
• Gain visibility and business opportunities
• Possibility of mixed open/closed source approach