CPU must:◦ Fetch instructions
(suruhan ambil)◦ Interpret _______ (tafsir
_______)◦ _______ data (_______ data)◦ Process data (proses data)◦ Write data (tulis data)
Registers form the highest level of the memory hierarchy (hierarki ingatan)◦ Small set of high speed storage locations◦ ______ storage for data and control information
Two types of registers◦ User-visible
May be referenced by assembly-level instructions (suruhan paras perhimpunan) and are thus “_______” to the user
◦ Control (kawalan) and _______ registers Used to control the operation of the CPU Most are not visible to the user
General categories based on function◦ General purpose (Serba guna)
Can be assigned a variety of functions Ideally, they are defined _______ to the operations
within the instructions◦ _______
These registers only hold data◦ Address (Alamat)
These registers only hold _______ information Examples: general purpose address registers,
segment pointers, stack pointers, index registers◦ _______ codes (Kod _______)
Visible to the user but values set by the CPU as the result of performing operations
Example code bits: _______, _______, overflow (limpahan)
Bit values are used as the basis for conditional jump instructions (suruhan lompat bersyarat)
Design trade off (tukar ganti) between general purpose and specialized registers◦ General purpose registers _______ flexibility in
instruction design◦ _______ purpose registers permit implicit register
specification in instructions - reduces register field size in an instruction
◦ No clear “best” design approach
How many registers are enough?
◦ More registers permit more operands (kendalian) to be held within the CPU - reducing memory bandwidth requirements to some extent
◦ More registers cause an _______ in the field sizes needed to specify registers in an instruction word
◦ Locality of reference may not support too many registers
◦ Most machines use _______registers
How big (wide)?◦ Address registers should be _______ enough to hold the
longest address◦ Data registers should be wide enough to hold most data
types Would not want to use _______-bit registers if the vast
majority of data operations used 16 and 32-bit operands
Related to width of memory _______ bus Concatenate registers together to store longer formats
B-C registers in the 8085 AccA-AccB registers in the 68HC11
These registers are used during the _______, decoding (penyahkodan) and _______ of instructions◦ Many are not visible to the user / programmer◦ Some are visible but cannot be (easily) modified
Typical registers◦ _______ counter (PC)
Points to the next instruction to be executed◦ _______ register (IR)
Contains the instruction being executed (most recently)
◦ Memory _______ register (MAR) Contains the address of a location in memory
◦ Memory _______ / _______ register (MBR) Contains a word of data to be written to memory or
the word most recently read◦ Program _______ word(s)
Superset of condition code register Interrupt masks, supervisory modes, etc. Status information
A set of bits Includes Condition Codes _______
◦ Contains the sign of the result of the last arithmetic operation _______
◦ Set when the result is 0 _______
◦ Set if an operation resulted in a carry (addition) into or borrow (subtraction) out of a high-order bit
_______◦ Set if a logical compare result is equality
_______◦ Used to indicate arithmetic overflow
Interrupt enable/disable◦ Used to enable or disable interrupts
Supervisor◦ Indicates whether the CPU is executing in supervisor or user
mode
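As an illustration of how status bits of this kind can be derived, the following Python sketch computes sign, zero, carry and overflow flags for an 8-bit addition. The function name `add8_flags`, the dictionary keys and the 8-bit width are assumptions made for this example and do not describe any particular CPU.

```python
def add8_flags(a, b):
    """Add two 8-bit values; return (result, flags). Names are illustrative only."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "sign": (result >> 7) & 1,   # copy of the result's high-order bit
        "zero": int(result == 0),    # set when the result is 0
        "carry": int(total > 0xFF),  # carry out of the high-order bit
        # overflow: both operands share a sign and the result's sign differs
        "overflow": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags
```

For example, `add8_flags(0x7F, 0x01)` yields the result `0x80` with the sign and overflow flags set, because adding two positive values produced a negative-looking result.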
_______ Cycle◦ May require memory access
to fetch operands◦ _______ addressing requires
more memory accesses◦ Can be thought of as
additional instruction ________
Depends on CPU design In general:
_______◦ PC contains _______ of next instruction◦ Address moved to _______◦ Address placed on address bus◦ Control unit requests memory read◦ Result placed on _______ bus, copied to MBR, then to IR◦ Meanwhile PC _______ by 1
IR is examined If indirect addressing, indirect cycle is _______
◦ Right most N bits of _______ transferred to _______◦ Control unit requests memory _______◦ Result (address of _______) moved to MBR
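The register transfers of the fetch and indirect cycles can be sketched in Python as below. This is a minimal model under stated assumptions: the CPU is a plain dictionary, memory is a dictionary of words, and the instruction word is assumed to carry its address in the low 8 bits (`address_bits` is a hypothetical parameter, not taken from the notes).

```python
def fetch(cpu, memory):
    """Fetch cycle: PC -> MAR, memory read -> MBR -> IR, PC incremented by 1."""
    cpu["MAR"] = cpu["PC"]            # address of the next instruction
    cpu["MBR"] = memory[cpu["MAR"]]   # memory read, result arrives in MBR
    cpu["PC"] += 1                    # PC now points at the following instruction
    cpu["IR"] = cpu["MBR"]            # instruction copied to IR


def indirect(cpu, memory, address_bits=8):
    """Indirect cycle: address field -> MAR, then read the operand's address into MBR."""
    mask = (1 << address_bits) - 1
    cpu["MAR"] = cpu["MBR"] & mask    # rightmost N bits hold the address field
    cpu["MBR"] = memory[cpu["MAR"]]   # MBR now holds the address of the operand


# Hypothetical memory image: word 0 is an instruction whose address field is 5.
memory = {0: 0x1205, 5: 0x0009, 9: 42}
cpu = {"PC": 0, "MAR": 0, "MBR": 0, "IR": 0}
fetch(cpu, memory)
indirect(cpu, memory)
```

After both calls, `cpu["IR"]` still holds the instruction word while `cpu["MBR"]` holds 9, the address where the operand itself is stored.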
May take many forms Depends on _______ being executed May include
◦ _______ read/write◦ Input/Output◦ _______ transfers◦ _______ operations
_______ _______ Current PC saved to allow resumption after interrupt Contents of PC copied to MBR Special memory location (e.g. _______ pointer) loaded to
MAR MBR written to _______ PC loaded with address of interrupt handling routine Next instruction (first of _______ handler) can be fetched
Prefetch◦ Fetch accessing main _______◦ Execution usually does not _______ main memory◦ Can fetch next instruction during execution of current
instruction◦ Called instruction _______
Improved Performance◦ But not doubled:
Fetch usually _______ than execution Prefetch more than one instruction?
Any jump or _______ means that prefetched instructions are not the required instructions
◦ Add more _______ to improve performance
The Central Processing Unit (CPU) is the _______ combination (kombinasi lojik) of the _______ _______ _______ (ALU) and the system’s control unit
In this sub-section, we focus on the ALU and its operation◦ Overview of the ALU◦ Data representation (Perwakilan data)◦ Computer Arithmetic and its hardware implementation
The ALU is that part of the computer that actually performs _______ and _______ operations on data
All other elements of the computer system are there mainly to bring _______ to the ALU for processing or to take _______ from the ALU
Registers are used as _______ and _______ for most ALU operations
In early machines, _______ and _______ determined the overall structure of the CPU and its ALU◦ Result was that machines were built around a single
register, known as the __________ (penumpuk)◦ The __________ was used in almost all ALU related
_________
The _______ and _______of the CPU and the ALU is improved through increases in the complexity of the hardware◦ Use _______ register sets to store operands, addresses
and results◦ _______ the capabilities of the ALU◦ Use special hardware to support _______ of execution
between points in a program◦ _______ functional units within the ALU to permit
concurrent operations Problem: design a minimal cost yet fully functional ALU
◦ What building block components would be included?
Solution:◦ Only 2 basic _______ are required to produce a fully
functional ALU A bit-wide _______ _______ unit A 2-input _______ gate
◦ NAND is a functionally complete logic operation◦ Similarly, if you can add, all other arithmetic operations
can be derived from addition.◦ To conduct operations on _______ bit words is clearly
tedious (menjemukan)!◦ Goal then is to develop arithmetic and logic circuitry that
is algorithmically _______ while remaining cost effective
_______-_______ format◦ Positional representation using n bits◦ Left most bit position is the sign bit
0 for _______ number 1 for _______ number
◦ Remaining n-1 bits represent the _______◦ Range: -(2^(n-1) - 1) to +(2^(n-1) - 1)◦ Problems:
Sign must be considered during arithmetic operations Dual representation of zero (-0 and +0)
Ones ______________ format◦ Binary case of diminished (menyusut) _______
complement ◦ Negative numbers are represented by a bit-by-bit
______________ of the (positive) magnitude (the process of negation)
◦ Sign bit interpreted as in sign-magnitude format◦ Examples (8-bit words):
 +42 = 0 0101010
 -42 = 1 1010101
◦ Still have a _______ representation for zero (all zeros and all ones)
Twos ______________ format◦ Binary case of radix complement◦ Negative numbers, -X, are represented by the pseudo-positive number 2^n - |X|◦ With 2^n symbols
 2^(n-1) - 1 _______ numbers, 2^(n-1) _______ numbers
◦ Given the representation for +X, the representation for -X is found by taking the 1s complement of +X and adding 1
◦ Caution: avoid confusing the 2s complement _______ (representation) with the 2s complement _______
◦ Converting between two word lengths (e.g., convert an 8-bit format into a 16-bit format) requires a sign extension: The _______ bit is extended from its current location up
to the new location All bits in the extension take on the value of the old
_______ bit
 +18 = 00010010 (8 bits)
 +18 = 00000000 00010010 (16 bits)
 -18 = 11101110 (8 bits)
 -18 = 11111111 11101110 (16 bits)
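The three integer formats and the sign extension rule can be compared with a short Python sketch. The function names are hypothetical helpers written for this illustration; each returns the n-bit pattern as a string of 0s and 1s.

```python
def sign_magnitude(x, n):
    """n-bit sign-magnitude pattern: sign bit followed by n-1 magnitude bits."""
    if abs(x) > (1 << (n - 1)) - 1:
        raise ValueError("value does not fit")
    sign = 1 if x < 0 else 0
    return format((sign << (n - 1)) | abs(x), f"0{n}b")


def ones_complement(x, n):
    """n-bit ones complement: negative values are the bitwise inverse of the magnitude."""
    if abs(x) > (1 << (n - 1)) - 1:
        raise ValueError("value does not fit")
    mask = (1 << n) - 1
    value = x if x >= 0 else (~abs(x)) & mask
    return format(value, f"0{n}b")


def twos_complement(x, n):
    """n-bit twos complement: -X is stored as 2^n - |X|."""
    if not -(1 << (n - 1)) <= x <= (1 << (n - 1)) - 1:
        raise ValueError("value does not fit")
    return format(x & ((1 << n) - 1), f"0{n}b")


def sign_extend(bits, new_width):
    """Widen a twos complement bit string by repeating its sign bit."""
    return bits[0] * (new_width - len(bits)) + bits
```

For instance, `twos_complement(-18, 8)` gives `11101110`, and `sign_extend("11101110", 16)` gives `1111111111101110`, matching the 16-bit pattern for -18.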
Use of a single _______ adder is the simplest hardware◦ Must implement an n-repetition for-loop for an n-bit
addition◦ This is lots of _______ for a typical addition
Use a _______ adder unit instead◦ n full adder units cascaded together◦ In adding X and Y together unit i adds Xi and Yi to
produce SUMi and CARRYi◦ Carry out of each stage is the carry in to the next stage◦ Worst case add time is n times the delay of each unit --
despite the _______ operation of each adder unit -- Order (n) delay
◦ With signed numbers, watch out for _______: when adding 2 positive or 2 negative numbers, _______ has occurred if the result has the _______ sign
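A ripple adder built from cascaded full adders, together with the signed overflow check, can be sketched in Python as follows. The names `full_adder` and `ripple_add` and the default width of 8 bits are assumptions for this example; the loop stands in for the chain of hardware stages.

```python
def full_adder(x, y, carry_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = x ^ y ^ carry_in
    carry_out = (x & y) | (carry_in & (x ^ y))
    return s, carry_out


def ripple_add(x, y, n=8):
    """Add two n-bit patterns stage by stage; returns (sum, carry_out, overflow)."""
    mask = (1 << n) - 1
    x &= mask
    y &= mask
    carry = 0
    result = 0
    for i in range(n):
        # the carry out of stage i is the carry in to stage i + 1
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    sign_x = (x >> (n - 1)) & 1
    sign_y = (y >> (n - 1)) & 1
    sign_r = (result >> (n - 1)) & 1
    # signed overflow: operands share a sign but the result's sign differs
    overflow = sign_x == sign_y and sign_r != sign_x
    return result, carry, overflow
```

Adding 100 and 50 as 8-bit signed values, `ripple_add(100, 50)`, reports overflow because 150 exceeds the largest positive 8-bit twos complement value, 127.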
Alternatives to the ripple adder◦ Must allow for the worst case delay in a ripple adder◦ In most cases, _______ signals do not propagate through
the entire adder◦ Provide additional hardware to detect where carries will
occur or when the carry _______ is completed◦ Carry Completion Sensing Adders use additional circuitry
to detect the time when all carries are completed Signal control unit that add is finished Essentially an ______________ device Typical add times are O(log n)
◦ Carry ___________ Adders Predict in advance what adder stage of a ripple adder
will generate a carry out Use prediction to avoid the carry propagation delays --
generate all of the carries at once Add time is a _______, regardless of the width, n, of the
word -- O(1) Problem: prediction in stage i requires information from
all previous stages -- gates to implement this require large numbers of inputs, making this adder impractical for even moderate values of n
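One common way to predict carries uses per-stage generate and propagate signals; the notes above do not name these signals, so treat them as an assumption of this sketch. Each carry is written directly in terms of the inputs, which also shows why the number of gate inputs grows with the stage number. The function names are hypothetical.

```python
def lookahead_carries(x, y, n=4, c0=0):
    """Return [c0, c1, ..., cn], each computed directly from the inputs."""
    g = [(x >> i) & (y >> i) & 1 for i in range(n)]    # stage i generates a carry
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(n)]  # stage i propagates a carry
    carries = [c0]
    for i in range(n):
        # c[i+1] = g[i] OR p[i]g[i-1] OR ... OR p[i]p[i-1]...p[0]c0
        terms = []
        for j in range(i, -1, -1):
            product = g[j]
            for k in range(j + 1, i + 1):
                product &= p[k]
            terms.append(product)
        product = c0
        for k in range(i + 1):
            product &= p[k]
        terms.append(product)
        carries.append(int(any(terms)))
    return carries


def lookahead_add(x, y, n=4):
    """Add using the predicted carries; returns (sum, carry_out)."""
    carries = lookahead_carries(x, y, n)
    total = 0
    for i in range(n):
        s = ((x >> i) ^ (y >> i) ^ carries[i]) & 1
        total |= s << i
    return total, carries[n]
```

In software the terms are evaluated in loops, but in hardware every term for a given carry would be a separate AND gate feeding one wide OR gate, which is the fan-in problem described above.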
To perform X-Y, realize that X-Y = X+(-Y)
Therefore, the following hardware is “typical”
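The identity X - Y = X + (-Y) can be checked with a small Python sketch that negates Y by inverting its bits and adding 1, then reuses ordinary addition. The function name `subtract` and the 8-bit default width are assumptions for this example.

```python
def subtract(x, y, n=8):
    """Compute X - Y on n-bit patterns as X + (2s complement of Y)."""
    mask = (1 << n) - 1
    neg_y = (~y + 1) & mask      # invert every bit of Y, then add 1
    return (x + neg_y) & mask    # the same adder performs the subtraction
```

`subtract(7, 5)` returns 2, while `subtract(5, 7)` returns `0xFE`, which is the 8-bit twos complement pattern for -2.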
A number of methods exist to perform integer multiplication◦ Repeated _______: add the multiplicand to itself
“multiplier” times◦ Shift and add -- traditional “pen and paper” way of
multiplying (extended to binary format)◦ High speed (special purpose) hardware multipliers
_______ addition◦ Least sophisticated method◦ Just use adder over and over again◦ If the multiplier is n bits, can have as many as 2^n
iterations of addition -- O(2^n) !!!!◦ Not used in an _______
Shift and add◦ Computer’s version of the pen and paper approach:
       1011   (11)
     x 1101   (13)
    ===========
       1011
      00000    Partial products
     101100
    1011000
    ===========
   10001111   (143)
◦ The computer version accumulates the partial products into a running (partial) sum as the algorithm progresses
◦ Each partial product generation results in an _______ and _______ operation
Shift and add hardware for unsigned integers
Shift and add flowchart for unsigned integers
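The shift and add procedure for unsigned integers can be sketched in Python using a carry bit C, an accumulator A, a multiplier register Q and a multiplicand register M. The register names follow the usual textbook layout and are an assumption here, since the hardware figure is not reproduced in these notes.

```python
def shift_add_multiply(multiplicand, multiplier, n=4):
    """Unsigned n-bit by n-bit multiply; returns the 2n-bit product."""
    mask = (1 << n) - 1
    m = multiplicand & mask
    q = multiplier & mask
    a = 0
    c = 0
    for _ in range(n):
        if q & 1:                 # low multiplier bit is 1: add the multiplicand
            total = a + m
            c = total >> n        # carry out of the n-bit addition
            a = total & mask
        # shift C, A, Q right one position as a single unit
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (c << (n - 1))
        c = 0
    return (a << n) | q
```

With the operands from the worked example, `shift_add_multiply(0b1011, 0b1101)` returns 143.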
To multiply signed numbers (2s ____________)◦ Normal shift and add does not work (problem in the
basic algorithm of no sign extension to 2n bits)◦ ________ all numbers to their positive magnitudes,
multiply, then figure out the correct sign◦ Use a method that works for both positive and negative
numbers ________ algorithm is popular (recoding the multiplier)
◦ ________ algorithm As in S&A, strings of 0s in the ________ only require
shifting (no addition steps) “Recode” strings of 1s to permit similar ________ String of 1s from 2^u down to 2^v is treated as 2^(u+1) - 2^v
In other words,- At the right end of a string of 1s in the multiplier, perform a ________- At the left end of the string perform an ________- For all of the 1s in between, just do
________ Hardware modifications required in (Figure shift and
add hardware for unsigned integers)- Ability to perform ________- Ability to perform ________ shifting rather than logical shifting (for sign extension)- A flip flop for bit Q_-1
 To determine ________ (add and shift, subtract and shift, shift) examine the bit pair Q_0 Q_-1
- 00 or 11: just shift- 10: ________ and shift- 01: ________ and shift
Booth’s algorithm for multiplication
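A minimal Python sketch of Booth's algorithm on n-bit twos complement operands is given below. The register names A, Q, Q-1 and M follow the description above; the function name is hypothetical. One limitation of this sketch is stated as an assumption: with an n-bit A register the multiplicand must not be the most negative value, -2^(n-1), because its negation does not fit in n bits.

```python
def booth_multiply(multiplicand, multiplier, n=4):
    """Signed n-bit multiply; returns the product as a Python int.

    Assumes the multiplicand is not -2**(n-1) (see the note above).
    """
    mask = (1 << n) - 1
    sign_bit = 1 << (n - 1)
    m = multiplicand & mask
    a = 0
    q = multiplier & mask
    q_1 = 0
    for _ in range(n):
        pair = (q & 1, q_1)
        if pair == (1, 0):        # right end of a string of 1s
            a = (a - m) & mask
        elif pair == (0, 1):      # just past the left end of a string of 1s
            a = (a + m) & mask
        # arithmetic shift right of A, Q, Q-1 (the sign bit of A is preserved)
        q_1 = q & 1
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (a & sign_bit)
    product = (a << n) | q
    if product & (1 << (2 * n - 1)):   # interpret the 2n-bit result as signed
        product -= 1 << (2 * n)
    return product
```

For example, `booth_multiply(7, 3)` returns 21 and `booth_multiply(-7, 3)` returns -21 without any separate sign handling.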
Advantages of Booth:- Treats positive and negative numbers
________- Strings of 1s and 0s can be skipped over
with shift operations for faster ________ time
High performance multipliers
◦ ________ the computation time by employing more hardware than would normally be found in a S&A-type multiplier unit
◦ Not generally found in general-purpose processors due to expense
◦ Examples Combinational hardware multipliers Pipelined Wallace Tree adders from Carry-Save Adder
units
Once you have committed to implementing multiplication, implementing division is a relatively easy next step that utilizes much of the same hardware
Want to find quotient, Q, and remainder, R, such that D = Q x V + R
Restoring division for ________ integers◦ Algorithm adapted from the traditional “pen and paper”
approach◦ Algorithm is of time complexity O(n) for n-bit dividend◦ Uses essentially the same ALU hardware as the ________
multiplication algorithm Adder / subtractor unit ________ wide shift register AQ that can be shifted to the
left ________ for the divisor Control logic
Restoring division algorithm for unsigned integers
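Restoring division for unsigned integers can be sketched in Python with the dividend loaded into Q, the divisor in M and A cleared to zero. The function name is hypothetical, and a Python integer stands in for the A register so the trial subtraction can go negative before being restored.

```python
def restoring_divide(dividend, divisor, n=4):
    """Unsigned n-bit restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    mask = (1 << n) - 1
    a = 0
    q = dividend & mask
    m = divisor & mask
    for _ in range(n):
        # shift the combined AQ register left one position
        a = (a << 1) | (q >> (n - 1))
        q = (q << 1) & mask
        a -= m                    # trial subtraction
        if a < 0:
            a += m                # unsuccessful: restore A, quotient bit stays 0
        else:
            q |= 1                # successful: quotient bit is 1
    return q, a
```

`restoring_divide(7, 2)` returns `(3, 1)`, so that 7 = 3 x 2 + 1 as required by D = Q x V + R.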
For two’s complement numbers, must deal with the ________ extension “problem”
Algorithm:◦ Load M with divisor, AQ with dividend (using sign bit
extension)◦ ________ AQ left 1 position◦ If M and A have the same sign, A ← A - M, otherwise A ← A + M◦ Q0 ← 1 if the sign bit of A has not changed or (A=0 AND
Q=0), otherwise Q0 ← 0 and restore A◦ Repeat ________ and +/- operations for all bits in Q◦ Remainder is in A, quotient in Q
If the signs of the divisor and the dividend were the same, quotient is correct, otherwise, Q is the 2’s complement of the quotient
2’s complement division examples
________ fixed point schemes do not have the ability to represent very large or very small numbers
Need the ability to dynamically ________ the decimal point to a convenient location
Format: ±M x R^(±E)
Significand / mantissas are stored in a ________ format◦ Either 1.xxxxx or 0.1xxxxx◦ Since the 1 is required, don’t need to explicitly store it in
the data word -- insert it for calculations only Exponents can be positive or negative values
◦ Use ________ (Excess coding) to avoid operating on negative exponents
◦ ________ is added to all exponents to store as positive numbers
For a fixed n-bit representation length, there are 2^n combinations of symbols◦ If floating point ________ the range of numbers in the
format (compared to integer representation) then the “spacing” between the numbers must increase This causes a ________ in the format’s precision
◦ If more bits are allocated to the exponent, range is ________ at the expense of decreased precision
◦ Similarly, more significand bits increases the ________ and reduces the range
◦ The ________ is chosen at design time and is not explicitly represented in the format Small -- smaller range Large -- increased range but loss of significant bits as
a result of mantissa alignment when normalizing
Problems to deal with in the format◦ Representation of ________◦ Over and ________ and how to detect◦ ________ operations
IEEE 754 format◦ Defines single and double ________ formats (32 and
64 bits)◦ Standardizes formats across many different
platforms◦ Radix 2◦ Single
 Range approximately 10^-38 to 10^+38
 8-bit exponent with 127 bias 23-bit mantissa
◦ Double Range approximately 10^-308 to 10^+308
 11-bit exponent with 1023 bias 52-bit mantissa
IEEE 754 Formats
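The single precision layout (1 sign bit, 8-bit exponent with a bias of 127, 23-bit mantissa with an implied leading 1) can be inspected with Python's standard `struct` module. The helper names below are hypothetical, and `rebuild_normalized` only handles normalized numbers, not zero, denormals, infinities or NaN.

```python
import struct


def decode_single(value):
    """Split a number's IEEE 754 single precision encoding into its three fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, biased_exponent, fraction


def rebuild_normalized(sign, biased_exponent, fraction):
    """Reassemble a normalized value: insert the hidden 1 and remove the bias."""
    significand = 1 + fraction / 2**23
    return (-1) ** sign * significand * 2.0 ** (biased_exponent - 127)
```

For -6.5, which is -1.625 x 2^2, `decode_single(-6.5)` returns `(1, 129, 0x500000)`: the stored exponent is 2 + 127 = 129 and the stored fraction is 0.625 without the leading 1.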
Floating point arithmetic operations◦ Addition and subtraction
________ significand Add or subtract significand Post ________
◦ Multiplication ________ exponents Multiply significand Post normalize
◦ Division ________ exponents Divide significand Post normalize
In this section, we have focused on the operation of the CPU◦ Registers and their use◦ Instruction execution
Looked at the basic concepts associated with computer arithmetic◦ Number representation◦ Basic ALU construction◦ Hardware and software implementations of multiplication
and division operations◦ Floating point numbers and operations
Computer Organization and Architecture, 6th Edition. Stallings, W. Prentice Hall.
Computer Organization and Design. David A. Patterson, John L. Hennessy. Morgan Kaufmann