Kris Gaj
Office hours: Monday, 6:00-7:00 PM Tuesday, Thursday, 7:30-8:30 PM, and by appointment
Research and teaching interests:
• cryptography
• computer arithmetic
• VLSI design and testing
Contact: Science & Technology II, room 223
(703) 993-1575
ECE 645
Part of:
MS in EE
MS in CpE
Digital Systems Design – pre-approved course
Other concentration areas – elective course
Certificate in VLSI Design/Manufacturing
PhD in IT
PhD in ECE
Spring 2007 Enrollment as of January 22, 2006
MS in CpE: 7
MS in EE: 4
PhD in CS: 1
Non-Degree: 4
My general area of interest is…
I want to specialize primarily in…
VLSI
Digital Systems Design
ASICs & FPGAs
VHDL/Verilog
CAD Tools
Reconfigurable Computing
Microelectronics
VLSI Fabrication
Nanoelectronics
CAD tools & Design Automation
Hardware description languages
FPGAs & Reconfigurable computing
Computer arithmetic
Front-end ASIC design (algorithmic down to gate level)
Back-end ASIC design (transistor down to device level)
Analog & Mixed Circuit Design
VLSI Fabrication
Micro- and Nanoelectronics
Semiconductor Devices
Recommended degree & concentration:
MS CpE, Digital Systems Design
MS EE, Microelectronics
[Figure: course map arranged by level of design description (algorithmic, register-transfer, gate, transistor, layout, devices), with CpE core and EE core courses marked]
ECE 545 Introduction to VHDL
ECE 645 Computer Arithmetic
ECE 584 Semiconductor Device Fundamentals
ECE 586 Digital Integrated Circuits
ECE 587 Analog Integrated Circuits
ECE 681 VLSI Design Automation
ECE 682 VLSI Test Concepts
ECE 684 MOS Device Electronics
ECE 699 Mixed Signals VLSI
ECE 699 Nanoelectronics
ECE 745 ULSI Microelectronics
MS CpE: DIGITAL SYSTEMS DESIGN
Concentration advisors: Kris Gaj, Ken Hintz, David Hwang
1. ECE 545 Introduction to VHDL – K. Gaj, D. Hwang, project, VHDL, Aldec/Synplicity/Xilinx and Synopsys Design Analyzer/PrimeTime
2. ECE 645 Computer Arithmetic: HW and SW Implementation – K. Gaj, project, VHDL, Aldec/Synplicity/Xilinx and Synopsys Design Analyzer/PrimeTime
3. ECE 681 VLSI Design Automation – T. Storey, project/lab, back-end design with Synopsys tools
4. ECE 586 Digital Integrated Circuits – D. Ioannou
Prerequisites
ECE 545 Introduction to VHDL
or
Permission of the instructor, granted assuming that you know
• VHDL or Verilog
• High level programming language (preferably C)
Course web page
ECE web page → Courses → Course web pages → ECE 645
http://teal.gmu.edu/courses/ECE645/index.htm
Computer Arithmetic
Lecture
• Homework 10 %
• Midterm exam 1 (in class) 20 %
• Midterm exam 2 (take-home) 20 %
Project
• Project 1 20 %
• Project 2 30 %
Advanced digital circuit design course covering
Efficient
• addition and subtraction
• multiplication
• division and modular reduction
• exponentiation
of
Integers
• unsigned and signed
Real numbers
• fixed point
• single and double precision floating point
Elements of the Galois field GF(2^n)
• polynomial basis
Lecture topics (1)
INTRODUCTION
1. Applications of computer arithmetic algorithms
2. Number representation
• Unsigned Integers
• Signed Integers
• Fixed-point real numbers
• Floating-point real numbers
• Elements of the Galois Field GF(2^n)
ADDITION AND SUBTRACTION
1. Basic addition, subtraction, and counting
2. Carry-lookahead, carry-select, and hybrid adders
3. Adders based on Parallel Prefix Networks
MULTIOPERAND ADDITION
1. Carry-save adders
2. Wallace and Dadda Trees
3. Adding multiple signed numbers
MULTIPLICATION
1. Tree and array multipliers
2. Sequential multipliers
3. Multiplication of signed numbers and squaring
DIVISION
1. Basic restoring and non-restoring sequential dividers
2. SRT and high-radix dividers
3. Array dividers
LONG INTEGER ARITHMETIC
1. Modular Exponentiation
2. Multi-Precision Arithmetic in Software
FLOATING POINT AND GALOIS FIELD ARITHMETIC
1. Floating-point units
2. Galois Field GF(2^n) units
Similar courses at other universities
• University of California, Santa Barbara, Behrooz Parhami, ECE252B: Computer Arithmetic.
• University of Massachusetts, Amherst, Israel Koren, ECE666: Digital Computer Arithmetic.
• Lehigh University, Michael Schulte, ECE496: High-Speed Computer Arithmetic.
• Worcester Polytechnic Institute, Berk Sunar, EE-579 V Computer Arithmetic Circuits.
• Stanford University, Michael Flynn, EE486: Advanced Computer Arithmetic.
• University of California, Davis, Vojin Oklobdzija, ECE278: Computer Arithmetic for Digital Implementation.
New in this course
• real-life project based on VHDL or Verilog HDL
• operations in the Galois Field (with application in cryptography and communications)
Possible topics for a Scholarly Paper or Research Project
for the CpE & EE students
Advanced Computer Arithmetic
• Square root
• Exponential and logarithmic functions
• Trigonometric functions
• Hyperbolic functions
• Fault-Tolerant Arithmetic
• Low-Power Arithmetic
• High-Throughput Arithmetic
Literature (1)
Required textbook:
Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design, Oxford University Press, 2000.
Recommended textbooks:
Milos D. Ercegovac and Tomas Lang, Digital Arithmetic, Morgan Kaufmann Publishers, 2004.
Israel Koren, Computer Arithmetic Algorithms, 2nd edition, A. K. Peters, Natick, MA, 2002.
Literature (2)
VHDL books (used in ECE 545 in Fall 2005)
1. Sundar Rajan, Essential VHDL: RTL Synthesis Done Right, S & G Publishing, 1998.
2. Volnei A. Pedroni, Circuit Design with VHDL, The MIT Press, 2004.
Literature (3)
Supplementary books:
1. E. E. Swartzlander, Jr., Computer Arithmetic, vols. I and II, IEEE Computer Society Press, 1990.
2. Alfred J. Menezes, Paul C. van Oorschot, and Scott A. Vanstone, Handbook of Applied Cryptography, Chapter 14, Efficient Implementation, CRC Press, Inc., 1998.
3. Christof Paar, Efficient VLSI Architectures for Bit Parallel Computation in Galois Fields, VDI Verlag, 1994.
Literature (3)
Proceedings of conferences
• ARITH - International Symposium on Computer Arithmetic
• ASIL - Asilomar Conference on Signals, Systems, and Computers
• ICCD - International Conference on Computer Design
• CHES - Workshop on Cryptographic Hardware and Embedded Systems
Journals and periodicals
• IEEE Transactions on Computers, in particular special issues on computer arithmetic: 8/70, 6/73, 7/77, 4/83, 8/90, 8/92, 8/94, 7/00, 3/05.
• IEEE Transactions on Circuits and Systems
• IEEE Transactions on Very Large Scale Integration
• IEE Proceedings: Computer and Digital Techniques
• Journal of VLSI Signal Processing
Homework
• reading assignments (main textbook + articles)
• analysis of hardware and software algorithms and implementations
• design of small hardware units using VHDL or Verilog
Optional assignments
Possibility of trading
analysis vs. design vs. coding
Midterm exams
Exam 1 - 2 hrs 30 minutes, in class
multiple choice + short problems
Exam 2 - 48 hrs, take-home
conceptual questions, analysis and design of arithmetic units using VHDL or Verilog HDL
Practice exams on the web
Tentative days of exams:
Exam 1 - Monday, March 26
Exam 2 - Saturday-Sunday, May 5-6
Project (1)
Project I (20% of grade)
Design and comparative analysis of fast adders (several hundred bits long)
Optimization criteria:
• minimum latency
• maximum throughput
• minimum area
• minimum product latency · area
• maximum ratio throughput/area
• scalability
Similar for all students
Done individually
Final report due Monday, March 19
Project (2)
Project II (30% of grade)
Fast
• multiplication
• squaring
• division
• modular reduction, or
• modular exponentiation
of long unsigned or signed integers
or
Fast
• addition or
• multiplication
of floating-point numbers
Written report & oral presentation: Monday, May 14
Project II (rules)
• Real life application
• Requirements derived from the analysis of the application
• Typically both hardware and software design
• Several project topics proposed on the web
• You can choose project topic by yourself
• Can be done in a group of 1-3 students
Project II (rules)
• Cooperation (but not exchange of codes) between teams is encouraged
• Every team works on a slightly different problem
• Project topics should be more complex for larger teams
Project
Hardware
• VHDL (or Verilog) code
• Latency and/or throughput
• Area
• Scalability
Software
• High level language (C preferred)
• Execution time
• Memory requirements
• Scalability
Degrees of freedom and possible trade-offs
[Figure: trade-offs among speed, area, power, and testability, annotated with the courses ECE 645, ECE 682, and ECE 586, 681]
Degrees of freedom and possible trade-offs
[Figure: speed (latency, throughput) vs. area]
Timing parameters

parameter         definition                            units     pipelining
delay             time point → point                    ns
latency           time input → output                   ns        bad
throughput        #output bits/time unit                Mbits/s   good
clock period      rising edge → rising edge of clock    ns        good
clock frequency   1/clock period                        MHz       good
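The relationship between these timing parameters can be illustrated with a minimal sketch. The stage delays, word width, and register overhead below are hypothetical numbers chosen only for illustration, and `timing` is not a function from the course materials. The sketch assumes the slowest stage sets the pipelined clock period and that one result is produced per clock cycle.

```python
def summarize(period_ns, cycles, word_bits):
    """Derive the timing parameters of a datapath from its clock period."""
    return {
        "clock_period_ns": period_ns,
        "clock_mhz": 1000.0 / period_ns,                   # 1 / clock period
        "latency_ns": period_ns * cycles,                  # time input -> output
        "throughput_mbps": word_bits / period_ns * 1000.0, # output bits per time unit
    }


def timing(stage_delays_ns, word_bits, register_overhead_ns=0.5):
    """Compare an unpipelined datapath with a pipelined one (hypothetical model)."""
    # Unpipelined: all stages form one combinational path between registers.
    comb_period = sum(stage_delays_ns) + register_overhead_ns
    # Pipelined: a register after every stage; the slowest stage sets the clock.
    pipe_period = max(stage_delays_ns) + register_overhead_ns
    return {
        "unpipelined": summarize(comb_period, 1, word_bits),
        "pipelined": summarize(pipe_period, len(stage_delays_ns), word_bits),
    }


if __name__ == "__main__":
    for name, figures in timing([4.0, 6.0, 5.0], word_bits=32).items():
        print(name, figures)
```

With these assumed delays, pipelining lengthens latency slightly while raising clock frequency and throughput, which is the "bad"/"good" pattern summarized in the table.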
Project technologies
semi-custom Application Specific Integrated Circuits and Field Programmable Gate Arrays
Levels of design description
Algorithmic level
Register Transfer Level
Logic (gate) level
Circuit (transistor) level
Physical (layout) level
Level of description most suitable for synthesis: Register Transfer Level
Register Transfer Logic (RTL) Design Description
[Figure: blocks of combinational logic separated by registers, all registers driven by a common clock]
CAD software available at GMU (1)
VHDL simulators
• Aldec Active-HDL (under Windows)
  • available in the FPGA Lab, S&T II, room 203
  • student edition can be purchased on an individual basis ($59.95 + S&H)
• ModelSim Xilinx Edition III (under Windows)
  • available in the FPGA Lab, S&T II, room 203
  • limited version available for free for individual use at home as a part of Xilinx WebPACK
CAD software available at GMU (2)
Tools used for logic synthesis
FPGA synthesis
• Synplicity Synplify Pro (under Windows)
• Xilinx XST (under Windows)
  • available in the FPGA Lab, S&T II, room 203
  • available for free as a part of WebPACK
ASIC synthesis
• Synopsys Design Compiler and PrimeTime (under Unix)
  • available from all PCs in the ECE educational labs using an X-terminal emulator
  • available remotely from home using a fast Internet connection
CAD software available at GMU (3)
Tools used for implementation (mapping, placing & routing) in the FPGA technology
• Xilinx ISE (under Windows)
  • available in the FPGA Lab, S&T II, room 203
• Xilinx WebPACK (under Windows)
  • limited version available for free for individual use at home as a part of Xilinx WebPACK
How to learn VHDL for synthesis by yourself?
• Lecture slides for ECE 545 from Fall 2005
• Sundar Rajan, Essential VHDL: RTL Synthesis Done Right, S & G Publishing, 1998.
• Volnei A. Pedroni, Circuit Design with VHDL, The MIT Press, 2004.
• Individual or small-group hands-on sessions with the TA
• Practice, Practice, Practice!!!
Testbench
[Figure: a non-synthesizable testbench drives a synthesizable design entity, which may have several architectures (Architecture 1, Architecture 2, ..., Architecture N)]
Hardware Design Verification
[Figure: the testbench applies representative inputs to the HDL design (VHDL or Verilog) and to a reference model (C or MAGMA), then checks whether the actual results equal the expected results]
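The reference-model side of this verification flow can be sketched in software. The slide names C or MAGMA for the reference model; the sketch below uses Python purely for illustration. The function names, the choice of an unsigned adder as the unit under test, and the text format of the test vectors are all assumptions, not part of the course materials.

```python
import random


def reference_add(a, b, width):
    """Reference model of an unsigned adder: returns (sum mod 2^width, carry out)."""
    total = a + b
    return total & ((1 << width) - 1), total >> width


def make_vectors(width, n_random=4, seed=645):
    """Representative inputs: corner cases plus a few reproducible random pairs."""
    top = (1 << width) - 1
    corners = [0, 1, top - 1, top]
    rng = random.Random(seed)
    pairs = [(a, b) for a in corners for b in corners]
    pairs += [(rng.randint(0, top), rng.randint(0, top)) for _ in range(n_random)]
    # Each vector holds the inputs followed by the expected results.
    return [(a, b) + reference_add(a, b, width) for a, b in pairs]


def format_vector(vector, width):
    """One line per vector (hex operands, hex sum, carry) for an HDL testbench to read."""
    digits = (width + 3) // 4
    a, b, s, carry = vector
    return f"{a:0{digits}X} {b:0{digits}X} {s:0{digits}X} {carry}"


if __name__ == "__main__":
    for vec in make_vectors(8):
        print(format_vector(vec, 8))
```

A testbench would read each line, apply the two operands to the HDL design, and compare the actual sum and carry with the expected values in the same line.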
Primary applications (1)
Execution units of general purpose microprocessors
Integer units Floating point units
Integers (8, 16, 32, 64 bits)
Real numbers (32, 64 bits)
Primary applications (2)
Digital signal and digital image processing
Real or complex numbers (fixed-point or floating point)
e.g., digital filters, Discrete Fourier Transform, Discrete Hilbert Transform
General purpose DSP processors
Specialized circuits
Primary applications (3)
Coding
Elements of the Galois fields GF(2^n) (4-64 bits)
Error detection codes
Error correcting codes
Secret-key (Symmetric) Cryptosystems
[Figure: Alice encrypts and Bob decrypts across a network; both hold the same key of Alice and Bob, KAB]
Primary applications (4)
Cryptography
Secret key cryptography
Integers (16, 32 bits): IDEA, RC6, MARS
Elements of the Galois field GF(2^n) (4, 8 bits): Twofish, Rijndael
Cipher      Main operations                       Auxiliary operations
MARS        MUL32, 2 x ROL32, S-box 9x32          XOR, ADD/SUB32
RC6         2 x SQR32, 2 x ROL32                  XOR, ADD/SUB32
Twofish     96 S-box 4x4, 24 MUL GF(2^8)          XOR, ADD32
Rijndael    16 S-box 8x8, 24 MUL GF(2^8)          XOR
Serpent     8 x 32 S-box 4x4                      XOR
Public Key (Asymmetric) Cryptosystems
[Figure: Alice encrypts with the public key of Bob, KB, and sends the ciphertext across a network; Bob decrypts with the private key of Bob, kB]
RSA as a trap-door one-way function
PUBLIC KEY: $M \rightarrow C = f(M) = M^e \bmod N$
PRIVATE KEY: $C \rightarrow M = f^{-1}(C) = C^d \bmod N$
$N = P \cdot Q$, where $P$, $Q$ are large prime numbers
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$
RSA keys
PUBLIC KEY: $\{ e, N \}$
PRIVATE KEY: $\{ d, P, Q \}$
$N = P \cdot Q$
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$
$P$, $Q$ - large prime numbers
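The key relations above can be checked on a toy example. The sketch below uses deliberately tiny, insecure primes (P = 61, Q = 53), chosen here only for illustration and not taken from the slides. The `mod_exp` helper is a plain left-to-right square-and-multiply routine, one of several possible modular exponentiation methods.

```python
def mod_exp(base, exponent, modulus):
    """Left-to-right binary (square-and-multiply) modular exponentiation."""
    result = 1
    base %= modulus
    for bit in bin(exponent)[2:]:          # scan exponent bits, MSB first
        result = (result * result) % modulus
        if bit == "1":
            result = (result * base) % modulus
    return result


# Toy parameters (assumed for illustration; real keys use primes of hundreds of digits).
P, Q = 61, 53
N = P * Q                      # 3233
PHI = (P - 1) * (Q - 1)        # 3120
E = 17                         # public exponent
D = 2753                       # private exponent
assert (E * D) % PHI == 1      # e * d = 1 mod ((P-1)(Q-1))

M = 65                         # message
C = mod_exp(M, E, N)           # encryption with the public key {e, N}
M_recovered = mod_exp(C, D, N) # decryption with the private exponent d

if __name__ == "__main__":
    print("ciphertext:", C)
    print("recovered message:", M_recovered)
```

For real key sizes the operands are more than a thousand bits long, which is why modular exponentiation appears among the long integer arithmetic topics of this course.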
Primary applications (5)
Cryptography
Long integers (1000-16,000 bits)
Public key cryptography
RSA, DSA, Diffie-Hellman
Elliptic Curve Cryptosystems
Elements of the Galois field GF(2^n) (150-500 bits)
Primary applications (5)
Cipher Breaking
Public key cryptography
RSA PUBLIC KEY: $\{ e, N \}$
RSA PRIVATE KEY: $\{ d, P, Q \}$
$N = P \cdot Q$
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$
RSA Security Inc. estimate of the number and memory of PCs necessary to break RSA-1024
Attack time: 1 year
Single machine: PC, 500 MHz, 170 GB RAM
Number of machines: 342,000,000
Factoring 1024-bit RSA keys using Number Field Sieve (NFS)
[Figure: steps of the NFS]
Polynomial Selection
Relation Collection
• Sieving
• Norm Factoring/Cofactoring (200-bit & 350-bit numbers): ECM, p-1 method, Pollard rho
Linear Algebra
Square Root
Sashisu Bajracharya
Ramakrishna Bachimanchi
Comparison among technologies
SRC, COPACOBANA
Microprocessors, ASICs, FPGAs
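FPGAs vs. Microprocessors
[Figure: bar chart for the rho, p-1, and ECM methods; values as labeled in the chart]

Platform            rho    p-1    ECM
Virtex2v6000        869    857    435
Spartan3s5000       637    635    315
Pentium4 2.8GHz      80     76     40

Ratio to Pentium4 2.8GHz
Virtex2v6000        10.8x  11.3x  10.8x
Spartan3s5000        7.8x   8.4x   7.9x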
Rho in an ASIC 130 nm
[Figure: chip layout showing local memory and global memory]
ASIC 130 nm vs. Virtex II 6000 – rho (24 units)
Area of Virtex II 6000 (estimation by R.J. Lim Fong, MS Thesis, VPI, 2004): 19.68 mm × 19.80 mm
Area of an ASIC with equivalent functionality: 2.7 mm × 2.82 mm
Area ratio: 51x
Number of computations per second using the same chip area

                        rho       ECM
130 nm ASIC library     88,405    21,739
Virtex2v6000 FPGA       869       435
Ratio                   101x      50x
Cofactorization Unit: an interesting Computer Arithmetic project
Famous computer arithmetic bugs and flaws
Learn to deal with approximations
• In digital arithmetic one has to come to grips with approximation and questions like:
  – When is approximation good enough
  – What margin of error is acceptable
• Be aware of the applications you are designing the arithmetic circuit or program for
• Analyze the implications of your approximation
Calculators
$u = \sqrt{\sqrt{\cdots\sqrt{2}}} = 1.000\,677\,131$ (square root taken 10 times)
$v = 2^{1/1024} = 1.000\,677\,131$
$x = (((u^2)^2)\cdots)^2 = 1.999\,999\,963$ (squared 10 times)
$x' = u^{1024} = 1.999\,999\,973$
$y = (((v^2)^2)\cdots)^2 = 1.999\,999\,983$ (squared 10 times)
$y' = v^{1024} = 1.999\,999\,994$
Hidden digits in the internal representation of numbers
Different algorithms give slightly different results
Very good accuracy
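The calculator experiment can be reproduced approximately in software. The sketch below is an illustration under stated assumptions, not a model of any particular calculator. It uses Python's `decimal` module with 10 significant digits to stand in for a calculator without hidden digits, and ordinary double precision for comparison. The function name `root_then_square` is made up for this example, and the digits it produces need not match the slide values.

```python
import math
from decimal import Decimal, localcontext


def root_then_square(digits, steps=10):
    """Take the square root of 2 `steps` times, then square the result `steps` times,
    rounding every intermediate value to `digits` significant decimal digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        u = Decimal(2)
        for _ in range(steps):
            u = u.sqrt()          # rounded to ctx.prec digits
        x = u
        for _ in range(steps):
            x = x * x             # each product is rounded again
        return u, x


# 10-digit decimal arithmetic (assumed stand-in for a calculator display width).
u10, x10 = root_then_square(10)

# The same experiment in IEEE double precision (about 16 decimal digits).
u_dbl = 2.0
for _ in range(10):
    u_dbl = math.sqrt(u_dbl)
x_dbl = u_dbl
for _ in range(10):
    x_dbl *= x_dbl

if __name__ == "__main__":
    print("10 digits:", u10, x10)
    print("double   :", u_dbl, x_dbl)
```

Any rounding error in u is amplified roughly 1024 times by the repeated squaring, so extra internal digits matter far more than the displayed value of u suggests.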
Consequences of bad approximations
Example: Failure of Patriot Missile (1991 Feb. 25)
Source http://www.math.psu.edu/dna/455.f96/disasters.html
American Patriot Missile battery in Dhahran, Saudi Arabia, failed to intercept incoming Iraqi Scud missile. The Scud struck an American Army barracks, killing 28.
Cause, per GAO/IMTEC-92-26 report: “software problem” (inaccurate calculation of the time since boot)
Specifics of the problem: time in tenths of second as measured by the system’s internal clock was multiplied by 1/10 to get the time in seconds. Internal registers were 24 bits wide.
$1/10 = 0.0001\,1001\,1001\,1001\,1001\,100_2$ (chopped to 24 b)
Error $\approx 0.1100\,1100_2 \times 2^{-23} \approx 9.5 \times 10^{-8}$
Error in 100-hr operation period $\approx 9.5 \times 10^{-8} \times 100 \times 60 \times 60 \times 10 = 0.34$ s
Distance traveled by Scud $= (0.34\ \mathrm{s}) \times (1676\ \mathrm{m/s}) \approx 570$ m
This put the Scud outside the Patriot’s “range gate”.
Ironically, the fact that the bad time calculation had been improved in some (but not all) code parts contributed to the problem, since it meant that inaccuracies did not cancel out.
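The error figures quoted for the Patriot case can be recomputed with exact rational arithmetic. The sketch below assumes, following the bit pattern shown on the slide, that 1/10 is chopped after 23 fractional bits. The variable names are illustrative only.

```python
from fractions import Fraction

FRACTION_BITS = 23                      # fractional bits kept, per the bit pattern on the slide

exact = Fraction(1, 10)
# Chop (truncate, do not round) 1/10 to the available fractional bits.
chopped = Fraction(int(exact * 2**FRACTION_BITS), 2**FRACTION_BITS)
error_per_tick = exact - chopped        # error in each multiplication by 1/10

ticks = 100 * 60 * 60 * 10              # tenths of a second in 100 hours
clock_error_s = error_per_tick * ticks  # accumulated timing error in seconds

SCUD_SPEED_M_PER_S = 1676
miss_distance_m = clock_error_s * SCUD_SPEED_M_PER_S

if __name__ == "__main__":
    print(f"error per tick : {float(error_per_tick):.3e}")
    print(f"clock error    : {float(clock_error_s):.4f} s")
    print(f"miss distance  : {float(miss_distance_m):.0f} m")
```

Without intermediate rounding the distance comes out near 575 m; the slide's 570 m results from rounding the clock error to 0.34 s first.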
Example: Explosion of Ariane Rocket (1996 June 4)
Source http://www.math.psu.edu/dna/455.f96/disasters.html
Unmanned Ariane 5 rocket launched by the European Space Agency veered off its flight path, broke up, and exploded only 30 seconds after lift-off (altitude of 3700 m)
The $500 million rocket (with cargo) was on its 1st voyage after a decade of development costing $7 billion
Cause: “software error in the inertial reference system”
Specifics of the problem: a 64 bit floating point number relating to the horizontal velocity of the rocket was being converted to a 16 bit signed integer
An SRI* software exception arose during conversion because the 64-bit floating point number had a value greater than what could be represented by a 16-bit signed integer (max 32 767)
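The failure mode can be illustrated with a small sketch of a float to 16-bit signed integer conversion. The function names and the sample value are hypothetical and unrelated to the actual flight software. The first function raises an exception when the value is out of range; the second shows the silently wrapped result an unchecked conversion would produce.

```python
import struct

INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value):
    """Convert a float to a 16-bit signed integer, raising on overflow."""
    n = int(value)                      # truncate toward zero
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return n


def to_int16_wrapped(value):
    """Keep only the low 16 bits and reinterpret them as a signed integer."""
    low_bits = int(value) & 0xFFFF
    return struct.unpack("<h", struct.pack("<H", low_bits))[0]


if __name__ == "__main__":
    sample = 40000.0                    # hypothetical value above the 32 767 limit
    print("wrapped:", to_int16_wrapped(sample))
    try:
        to_int16_checked(sample)
    except OverflowError as exc:
        print("exception:", exc)
```

Neither behaviour is safe by itself: the design question is what the system should do when the operand leaves the range assumed at design time.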
Consequences of bad approximations
Pentium bug (1)
October 1994: Thomas Nicely, Lynchburg College, Virginia, finds an error in his computer calculations and traces it back to the Pentium processor
November 7, 1994: First press announcement, Electronic Engineering Times
Late 1994: Tim Coe, Vitesse Semiconductor, presents an example with the worst-case error
c = 4 195 835/3 145 727
Pentium = 1.333 739 06...
Correct result = 1.333 820 44...
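The size of the error in the worst-case example above can be checked directly. The sketch below computes the correct quotient in double precision and compares it with the flawed value as quoted on the slide (truncated there to nine significant digits). The variable names are illustrative.

```python
from fractions import Fraction

NUMERATOR, DENOMINATOR = 4195835, 3145727

correct = NUMERATOR / DENOMINATOR           # IEEE double-precision quotient
pentium_result = 1.33373906                 # flawed result, digits as quoted on the slide

absolute_error = correct - pentium_result
relative_error = absolute_error / correct

# Exact rational quotient, usable as a reference independent of floating point.
exact = Fraction(NUMERATOR, DENOMINATOR)

if __name__ == "__main__":
    print(f"correct quotient : {correct:.11f}")
    print(f"absolute error   : {absolute_error:.3e}")
    print(f"relative error   : {relative_error:.3e}")
```

The two results agree only in the first four significant digits, far worse than the accuracy expected of double-precision division.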
Pentium bug (2)
Intel admits “subtle flaw”
November 30, 1994: Intel’s white paper about the bug and its possible consequences
• Intel - average spreadsheet user affected once in 27,000 years
• IBM - average spreadsheet user affected once every 24 days
Replacements based on customer needs
December 20, 1994: Announcement of no-questions-asked replacements
Pentium bug (3)
Error traced back to the look-up table used by the radix-4 SRT division algorithm
2048 cells, 1066 non-zero values {-2, -1, 1, 2}
5 non-zero values not downloaded correctly to the lookup table due to an error in the C script