Design of chip controller
Date: 21-Oct-2014
Category: Engineering
CHAPTER - 1: INTRODUCTION
Introduction
Digital design is a broad and fascinating field. Applications of digital design are present in our daily life, including computers, calculators, and video cameras. In fact, there will always be a need for high-speed, low-power digital products, which makes digital design a growing business. The ALU (arithmetic logic unit) is a critical component of a microprocessor and central processing unit; it is the heart of the instruction execution portion of every computer. An ALU comprises combinational logic that implements logical operations such as AND and OR, and arithmetic operations such as ADD and SUB. An ALU can be built to various specifications. A simple ALU has two operand inputs, one control input that selects the operation, and one output for the result. The goal of this project is to design a CHIP CONTROLLER consisting of a control unit and a 16-bit ALU with memory, which executes various arithmetic and logical operations. The hardware uses an accumulator or registers to store each result. Once the input operands have been read and the appropriate control signal has been received, the control unit will perform the computation and output the result. The control unit provides the necessary timing and control signals for all the operations in the ALU.
CHAPTER-2: BLOCK DIAGRAM AND ITS FUNCTIONALITY
2.1 BLOCK DIAGRAM
The main blocks of the processor are
Control unit
ALU
Memory
CONTROL UNIT
It is the main block of the processor, and it controls the ALU. Two 16-bit inputs, one 6-bit selection line, and a clock input are given to this control unit, so that on every positive edge of the clock the control unit takes the inputs and drives its output. The control unit then tells the ALU what operation is to be performed on the data with the help of the selection lines. It also tells the ALU whether to access memory or not.
[Figure: block diagram. The control unit (inputs INPUT1[15:0], INPUT2[15:0], SELECT[5:0], CLOCK; outputs OUTPUT[15:0], CARRY) drives the 16-bit ALU (inputs INPUT1[15:0], INPUT2[15:0], OPC[3:0]; outputs OUT[15:0], CARRY) and the 16 x 64 kbit memory (ADDR[15:0], DATAIN[15:0], DATAOUT[15:0], RW, EN).]

ALU
As far as the ALU in our design is concerned, it loads data from two 16 bit data lines. The
ALU performs operations on that data according to the instructions given by the control unit.
These instructions are given to the ALU as an opcode. The ALU in our design can perform arithmetic operations such as addition, subtraction, multiplication, and comparison, and logic operations such as AND, OR, XOR, XNOR, NOR, BUFFER, and NOT. The ALU used in our design is a static ALU.
MEMORY
The memory used in our design is Harvard-style in the sense that it serves as data memory only. The memory takes its address from a 16-bit address line and uses separate control lines for reading data from and writing data to memory. The size of the memory used is 16 x 64 kbits (65,536 words of 16 bits each).
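The memory behavior described above can be sketched as a small behavioral model in Python (illustrative only, not the project's HDL; the class name and the convention that RW = 1 means read are assumptions):

```python
# Behavioral sketch of the 16-bit-wide, 64K-word data memory with a single
# read/write (RW) control line. RW = 1 reads, RW = 0 writes (assumed polarity).

class DataMemory:
    """16-bit words, 16-bit address => 2**16 = 65,536 locations (16 x 64 kbits)."""

    def __init__(self):
        self.cells = {}  # sparse storage: address -> 16-bit word

    def access(self, addr, rw, data_in=0):
        addr &= 0xFFFF
        if rw:
            return self.cells.get(addr, 0)   # uninitialized cells read as 0
        self.cells[addr] = data_in & 0xFFFF  # clamp to the 16-bit word width
        return None

mem = DataMemory()
mem.access(0x00FF, rw=0, data_in=0x1234)   # write
assert mem.access(0x00FF, rw=1) == 0x1234  # read back
```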
2.2 OPCODES
S.NO  OP-CODE  S[5] S[4] S[3] S[2] S[1] S[0]  FUNCTION        MATHEMATICAL REPRESENTATION
 1    32        1    0    0    0    0    0    Addition        A + B
 2    33        1    0    0    0    0    1    Subtraction     A - B
 3    34        1    0    0    0    1    0    Multiplication  A * B
 4    35        1    0    0    0    1    1    Or              A | B
 5    36        1    0    0    1    0    0    And             A & B
 6    37        1    0    0    1    0    1    Xor             A ^ B
 7    38        1    0    0    1    1    0    Not             ~A
 8    39        1    0    0    1    1    1    Xnor            A ~^ B
 9    41        1    0    1    0    0    1    Comparison      A > B, A < B, A = B
10    46        1    0    1    1    1    0    Buffer          A
11    47        1    0    1    1    1    1    Buffer          B
12    48        1    1    0    0    0    0    Addition        A + MEM(B)
13    49        1    1    0    0    0    1    Subtraction     A - MEM(B)
14    50        1    1    0    0    1    0    Multiplication  A * MEM(B)
15    51        1    1    0    0    1    1    Or              A | MEM(B)
16    52        1    1    0    1    0    0    And             A & MEM(B)
17    53        1    1    0    1    0    1    Xor             A ^ MEM(B)
18    54        1    1    0    1    1    0    Not             ~MEM(A)
19    55        1    1    0    1    1    1    Xnor            A ~^ MEM(B)
20    57        1    1    1    0    0    1    Comparison      A > MEM(B), A < MEM(B), A = MEM(B)
21    62        1    1    1    1    1    0    Buffer          Move from memory
22    15        0    0    1    1    1    1    Buffer          Move to memory
In the selection lines, each bit has its own significance. The first four bits select the particular operation to be performed on the input data. The fifth bit tells the ALU whether or not it must access memory for an operand.
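The bit fields in the table can be sketched as a small Python decoder (an illustrative model; the field names, and reading S[5] as an enable-style bit, are assumptions, since the table only fixes the roles of S[3:0] and S[4]):

```python
# Decode the 6-bit select word: S[3:0] picks the ALU function and S[4]
# indicates that operand B is fetched from memory (compare opcodes 32 vs 48).
# The exact role of S[5] is an assumption; it is 1 for every listed opcode
# except "move to memory".

def decode_select(sel):
    sel &= 0x3F
    return {
        "operation": sel & 0x0F,         # S[3:0]: which ALU function
        "use_memory": bool(sel & 0x10),  # S[4]: operand comes from memory
        "s5": bool(sel & 0x20),          # S[5]: see note above
    }

add_direct = decode_select(0b100000)   # opcode 32: A + B
add_memory = decode_select(0b110000)   # opcode 48: A + MEM(B)
assert add_direct["operation"] == add_memory["operation"] == 0
assert not add_direct["use_memory"] and add_memory["use_memory"]
```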
CHAPTER-3: DESIGN METHODOLOGY
3.1 DESIGN METHODOLOGY
There were several ways to approach creating the ALU. Our group wanted to make use of the CAS (Complementary Addition and Subtraction) unit so that 2's complement arithmetic could be performed in one operation rather than several passes through the ALU. This led to a design in which multiple functions are executed simultaneously and the desired output is chosen using a multiplexer network.
3.2 ALU DESIGN
The ALU designed in this project performs ten different operations on two 16-bit
inputs, with and without using memory. The design utilizes the carry-look-ahead method for carry generation in order to speed up the performance of the ALU.
3.3 ADDER/SUBTRACTOR UNIT
3.3.1 CARRY LOOK AHEAD GENERATOR:
A parallel adder built by cascading full-adders is of the ripple carry type, in which the carry output of each full-adder stage is connected to the carry input of the next higher order stage. The sum and carry outputs of any stage therefore cannot be produced until the input carry occurs; this leads to a time delay in the addition process. This delay is known as the carry propagation delay, which can best be explained by considering the following addition,
0 1 0 1
+ 0 0 1 1
= 1 0 0 0
Addition of the LSB position produces a carry into the second position. This carry, when added to the bits of the second position (stage), produces a carry into the third position. The key thing to notice in this example is that the sum bit generated in the last position (MSB) depends on the carry that was generated by the addition in the previous positions. This means that the adder will not produce a correct result until the LSB carry has propagated through the intermediate full-adders. This represents a time delay that depends on the propagation delay of each full-adder. For example, if each full-adder is considered to have a propagation delay of 30 ns, then S3 will not reach its correct value until 90 ns after the LSB carry is generated. Therefore, the total time required to perform the addition is 90 + 30 = 120 ns.
Obviously, this situation becomes much worse if we extend the adder circuit to add a
greater number of bits. If the adder were handling 16-bit numbers, the carry propagation
delay could be 480 ns.
One method of speeding up this process, by eliminating the inter-stage carry delay, is called look-ahead carry addition. This method utilizes logic gates to look at the lower order bits of the augend and addend to see if a higher order carry is to be generated. Consider the circuit of the full-adder shown in fig 1. Here, we define two functions, carry propagate and carry generate:
Pi = Ai ⊕ Bi
Gi = AiBi
The output sum and carry can be expressed as
Si = Pi ⊕ Ci
Ci+1 = Gi + PiCi
Gi is called a carry generate, and it produces a carry when both Ai and Bi are one, regardless of the input carry. Pi is called a carry propagate because it is the term associated with the propagation of the carry from Ci to Ci+1.
Now the Boolean function for the carry output of each stage can be written as follows,
C2 = G1 + P1C1
C3 = G2 + P2C2 = G2 + P2 (G1 + P1C1)
= G2 + P2G1 + P2P1C1
C4 = G3 + P3C3 = G3 + P3 (G2 + P2G1 + P2P1C1)
= G3 + P3G2 + P3P2G1 + P3P2P1C1
From the above Boolean function it can be seen that C4 does not have to wait for C3
and C2 to propagate; in fact C4 is propagated at the same time as C2 and C3.
The Boolean functions for each output carry are expressed in sum-of-products form, so they can be implemented using AND-OR logic or NAND-NAND logic. Fig 2 shows the implementation of the Boolean functions for C2, C3 and C4 using AND-OR logic.
3.3.6 LOGIC DIAGRAM OF A CARRY LOOK AHEAD GENERATOR
Using a look ahead carry generator we can easily construct a 4-bit parallel adder with a look ahead carry scheme, as shown in fig 3. Each sum output requires two exclusive-OR gates. The output of the first exclusive-OR gate generates Pi, and the AND gate generates Gi. The carries are generated using the look-ahead carry generator and applied as inputs to the second exclusive-OR gate; the other input to each of these gates is Pi, and together they generate the sum output. Each output is generated after a delay of two gate levels. Thus outputs S2 through S4 have equal propagation delay times.
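The equations above can be exercised with a short Python model (a bit-level sketch for illustration; the loop computes the carries sequentially for brevity, whereas the hardware flattens each carry into a two-level AND-OR function of the Ps, Gs and C1 alone):

```python
# 4-bit carry look ahead adder: Pi = Ai XOR Bi, Gi = Ai AND Bi,
# Ci+1 = Gi + Pi*Ci, Si = Pi XOR Ci. Indices follow the text:
# C1 is the input carry and C5 the final carry.

def cla_add4(a_bits, b_bits, c1=0):
    """a_bits, b_bits: [A1..A4], LSB first. Returns ([S1..S4], C5)."""
    p = [ai ^ bi for ai, bi in zip(a_bits, b_bits)]  # carry propagate terms
    g = [ai & bi for ai, bi in zip(a_bits, b_bits)]  # carry generate terms
    c = [c1]
    for i in range(4):
        # Sequential here for brevity; in hardware each carry is a flattened
        # two-level AND-OR function, so no carry waits for the previous stage.
        c.append(g[i] | (p[i] & c[i]))
    s = [p[i] ^ c[i] for i in range(4)]              # second XOR level
    return s, c[4]

# The text's earlier example, 0101 + 0011 = 1000 (bit lists are LSB first):
s, c5 = cla_add4([1, 0, 1, 0], [1, 1, 0, 0])
assert s == [0, 0, 0, 1] and c5 == 0
```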
3.3.7 4-BIT PARALLEL ADDER WITH LOOK AHEAD CARRY GENERATOR
[Figure: logic diagram of the carry look ahead generator. Inputs C1, P1, P2, P3, G1, G2, G3; outputs C2, C3, C4.]

[Figure: 4-bit parallel adder with look ahead carry generator. Inputs A1 to A4, B1 to B4, and input carry C1; the P and G signals feed the carry look ahead generator, which produces C2 to C5; sum outputs S1 to S4.]
3.4 BINARY SUBTRACTOR:
The subtraction of unsigned binary numbers can be done most conveniently by means of complements. Remember that the subtraction A - B can be done by taking the 2's complement of B and adding it to A. The 2's complement can be obtained by taking the 1's complement and adding 1 to the least significant bit. The 1's complement can be implemented with inverters, and the 1 can be added to the sum through the input carry.
The circuit for subtracting A - B consists of an adder with an inverter placed between each data input B and the corresponding input of the full-adder. The input carry C0 must be equal to 1 when performing subtraction. The operation thus performed becomes A, plus the 1's complement of B, plus 1. This is equal to A plus the 2's complement of B. For unsigned numbers, this gives A - B if A >= B, or the 2's complement of (B - A) if A < B. For signed numbers, the result is A - B, provided that there is no overflow.
The addition and subtraction operations can be combined into one circuit with one common binary adder. This is done by including an exclusive-OR gate with each full-adder; one input of each gate receives the mode input M and the other receives a bit of B. When M = 0, we have B ⊕ 0 = B: the full-adders receive the value of B, the input carry is 0, and the circuit performs A plus B. When M = 1, we have B ⊕ 1 = B' and C0 = 1: the B inputs are all complemented and a 1 is added through the input carry, so the circuit performs the operation A plus the 2's complement of B. (The exclusive-OR gate with output V is for detecting an overflow.)
It is worth noting that binary numbers in the signed-complement system are added
and subtracted by the same basic addition and subtraction rules as unsigned numbers.
Therefore, computers need only one common hardware circuit to handle both types of
arithmetic. The user or programmer must interpret the results of such addition or subtraction
differently, depending on whether it is assumed that the numbers are signed or unsigned.
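The adder/subtractor just described can be sketched behaviorally in Python (the 4-bit width and the ripple structure are illustrative choices; the project's unit is 16 bits wide):

```python
# Adder/subtractor: an XOR gate on each B input, controlled by mode bit M,
# which also feeds the input carry. M = 0 gives A + B; M = 1 gives
# A + (~B) + 1, i.e. A - B in 2's complement.

def add_sub4(a, b, m):
    """Returns (4-bit result, final carry). For M = 1, carry = 1 means A >= B."""
    result = 0
    carry = m                       # input carry C0 = M
    for i in range(4):              # four cascaded full-adders
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ m     # XOR gate complements B when M = 1
        result |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (carry & (ai ^ bi))
    return result, carry

assert add_sub4(0b0101, 0b0011, 0) == (0b1000, 0)  # 5 + 3 = 8
assert add_sub4(0b0101, 0b0011, 1) == (0b0010, 1)  # 5 - 3 = 2
```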
3.5 MULTIPLICATION
Multiplication and division follow the same mathematical rules used in decimal
numbering. However, their implementation is substantially more complex as compared to
addition and subtraction. Multiplication can be performed inside a computer in the same way
that a person does so on paper. Consider 12 × 12 = 144.
      1 2
  ×   1 2
  -------
      2 4    Partial product × 10^0
  + 1 2      Partial product × 10^1
  -------
  1 4 4      Final product
The multiplication process grows in steps as the number of digits in each multiplicand
increases, because the number of partial products increases. Binary numbers function the
same way, but there easily can be many partial products, because numbers require more digits
to represent them in binary versus decimal. Here is the same multiplication expressed in
binary (1100 × 1100 = 10010000):
          1 1 0 0
  ×       1 1 0 0
  ---------------
          0 0 0 0    Partial product × 2^0
        0 0 0 0      Partial product × 2^1
      1 1 0 0        Partial product × 2^2
  + 1 1 0 0          Partial product × 2^3
  ---------------
  1 0 0 1 0 0 0 0    Final product
Walking through these partial products takes extra logic and time, which is why
multiplication and, by extension, division are considered advanced operations that are not
nearly as common as addition and subtraction. Methods of implementing these functions
require trade-offs between logic complexity and the time required to calculate a final result.
To see how a binary multiplier can be implemented with a combinational circuit,
consider the multiplication of two 2-bit numbers as shown in figure. The multiplicand bits are
B1 and B0, the multiplier bits are A1 and A0, and the product is C3 C2 C1 C0. The first partial
product is formed by multiplying A0 by B1B0. The partial product can be implemented with
AND gates as shown in the diagram. The second partial product is formed by multiplying A1 by B1B0 and shifting one position to the left. The two partial products are added with two
half adder (HA) circuits. Usually there are more bits in the partial products, and it is necessary to use full adders to produce the sum of the partial products. Note that the least significant bit of the product does not have to go through an adder, since it is formed by the output of the first AND gate.

            B1    B0
  ×         A1    A0
  ------------------
          A0B1  A0B0
    A1B1  A1B0
  ------------------
    C3    C2    C1    C0
A combinational binary multiplier with more bits can be constructed in a similar fashion. A bit of the multiplier is ANDed with each bit of the multiplicand in as many levels as there are bits in the multiplier. The binary output of each level of AND gates is added to the partial product of the previous level to form a new partial product. The last level produces the final product. For J multiplier bits and K multiplicand bits we need J × K AND gates and (J - 1) K-bit adders to produce a product of J + K bits.
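The scheme above can be sketched as follows (an arithmetic model of the AND-gate array rather than a gate-level one; the 4 × 4 width is an illustrative assumption):

```python
# Each multiplier bit A_j ANDs the whole multiplicand; the resulting partial
# product is shifted j places left and added into the running sum, mirroring
# the J levels of AND gates followed by (J - 1) adders described in the text.

def combinational_multiply(a, b, j_bits=4, k_bits=4):
    """a: J-bit multiplier, b: K-bit multiplicand -> (J + K)-bit product."""
    product = 0
    for j in range(j_bits):
        a_j = (a >> j) & 1                                  # one multiplier bit
        partial = (b & ((1 << k_bits) - 1)) if a_j else 0   # level-j AND gates
        product += partial << j                             # level-j adder
    return product & ((1 << (j_bits + k_bits)) - 1)

# The text's example: 1100 x 1100 = 10010000 (12 x 12 = 144)
assert combinational_multiply(0b1100, 0b1100) == 0b10010000
```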
3.6 COMPARATOR
The comparison of two numbers is an operation that determines whether one number is greater than, less than, or equal to the other number. A magnitude comparator is a combinational circuit that compares two numbers, A and B, and determines their relative
magnitudes. The outcome of the comparison is specified by three binary variables that
indicate whether A>B, A=B, or A<B.
The circuit for comparing two n-bit numbers has 2^2n entries in the truth table and becomes too cumbersome even with n = 3. On the other hand, as one may suspect, a comparator circuit possesses a certain amount of regularity. A digital function that possesses an inherent, well-defined regularity can usually be designed by means of an algorithmic procedure. An algorithm is a procedure that specifies a finite set of steps that, if followed, give the solution to the problem. We illustrate this method here by deriving an algorithm for the design of a 4-bit magnitude comparator.
The algorithm is a direct application of the procedure a person uses to compare the
relative magnitudes of two numbers. Consider two numbers, A and B, with four digits each.
Write the coefficients of the numbers with descending significance.
A=A3A2A1A0
B=B3B2B1B0
Each subscripted letter represents one of the digits in the number. The two numbers are equal if all pairs of significant digits are equal: A3 = B3, A2 = B2, A1 = B1, and A0 = B0. When the numbers are binary, the digits are either 1 or 0, and the equality relation of each pair of bits can be expressed logically with an exclusive-NOR function as
xi = AiBi + A'iB'i for i = 0, 1, 2, 3
where xi = 1 only if the pair of bits in position i are equal (i.e., if both are 1 or both are 0).
The equality of two numbers, A and B, is displayed in a combinational circuit by an
output binary variable that we designate by the symbol (A=B). This binary variable is equal
to 1 if the input numbers, A and B, are equal, and it is equal to 0 otherwise. For the equality
condition to exist, all xi variables must be equal to 1. This dictates an AND operation of all
variables:
(A=B)=x3x2x1x0
The binary variable (A=B) is equal to 1 only if all pairs of digits of the two numbers
are equal.
To determine whether A is greater than or less than B, we inspect the relative magnitudes of pairs of significant digits, starting from the most significant position. If the two digits are equal, we compare the next lower significant pair of digits. This comparison continues until a pair of unequal digits is reached. If the corresponding digit of A is 1 and that of B is 0, we conclude that A > B. If the corresponding digit of A is 0 and that of B is 1, we have A < B. The sequential comparison can be expressed logically by the two Boolean functions
(A>B)=A3B’3 + x3A2B’2 + x3x2A1B’1 + x3x2x1A0B’0
(A<B)=A’3B3 + x3A’2B2 + x3 x2A’1B1 + x3x2x1A’0B0
The symbols (A>B) and (A<B) are binary output variables that are equal to 1 when A>B or
A<B, respectively.
3.6.1. 4-BIT MAGNITUDE COMPARATOR
The gate implementation of the three output variables just derived is simpler than it
seems because it involves a certain amount of repetition. The unequal outputs can use the
same gates that are needed to generate the equal output. The logic diagram of the 4-bit
magnitude comparator is shown in fig 1. The four x outputs are generated with exclusive-
NOR circuits and applied to an AND gate to give the output binary variable (A=B). The other
two outputs use the x variables to generate the Boolean functions listed previously. This is a
multilevel implementation and has a regular pattern. The procedure for obtaining magnitude
comparator circuits for binary numbers with more than four bits is obvious from this
example.
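The three Boolean functions derived above translate directly into a bit-level Python transcription (for illustration; the function name is arbitrary):

```python
# 4-bit magnitude comparator from the derived equations, with x_i the
# exclusive-NOR (equality) of each bit pair.

def compare4(a, b):
    """Returns the three output variables ((A>B), (A=B), (A<B)) as 0/1."""
    A = [(a >> i) & 1 for i in range(4)]       # A0..A3
    B = [(b >> i) & 1 for i in range(4)]
    nA = [1 - bit for bit in A]                # complemented bits A'
    nB = [1 - bit for bit in B]
    x = [1 - (A[i] ^ B[i]) for i in range(4)]  # x_i = A_i XNOR B_i

    eq = x[3] & x[2] & x[1] & x[0]
    gt = (A[3] & nB[3]) | (x[3] & A[2] & nB[2]) | \
         (x[3] & x[2] & A[1] & nB[1]) | (x[3] & x[2] & x[1] & A[0] & nB[0])
    lt = (nA[3] & B[3]) | (x[3] & nA[2] & B[2]) | \
         (x[3] & x[2] & nA[1] & B[1]) | (x[3] & x[2] & x[1] & nA[0] & B[0])
    return gt, eq, lt

assert compare4(9, 4) == (1, 0, 0)
assert compare4(7, 7) == (0, 1, 0)
assert compare4(2, 11) == (0, 0, 1)
```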
[Figure: logic diagram of the 4-bit magnitude comparator. Inputs A3 to A0 and B3 to B0; internal equality signals X3 to X0 from exclusive-NOR gates; outputs (A<B), (A>B), (A=B).]
3.7 LOGIC GATES
Logic gates are the building blocks of digital electronics. The fundamental logic gates include the INVERT (NOT), AND, NAND, OR, exclusive-OR (XOR), and exclusive-NOR (XNOR) gates. Each of these gates performs a different logical operation. A description of what each logic gate does, along with switch- and transistor-level implementations, is discussed here.
3.7.1 INVERTER (NOT)
SYMBOL:
TRUTH TABLE:

INPUT   OUTPUT
A       NOT A
0       1
1       0

ELECTRONIC IMPLEMENTATION OF INVERTER:
[Figures: NMOS inverter, PMOS inverter, static CMOS inverter, and a saturated-load digital inverter.]
DESCRIPTION:
Y = ~A
A NOT gate, or inverter, produces an output logic level opposite to that of the input logic level.
3.7.2 AND
SYMBOL:
TRUTH TABLE:

INPUT   OUTPUT
A  B    A AND B
0  0    0
0  1    0
1  0    0
1  1    1

ELECTRONIC IMPLEMENTATION OF AND GATE:
DESCRIPTION:
Y = A & B
The output of the AND gate is high only when both inputs are high.

3.7.3 OR
SYMBOL:
TRUTH TABLE:

INPUT   OUTPUT
A  B    A OR B
0  0    0
0  1    1
1  0    1
1  1    1

ELECTRONIC IMPLEMENTATION OF OR GATE:
CMOS OR Gate
DESCRIPTION:
Y = A | B
The output of the OR gate is high when one or both inputs are high.

3.7.4 XOR
SYMBOL:
TRUTH TABLE:

INPUT   OUTPUT
A  B    A XOR B
0  0    0
0  1    1
1  0    1
1  1    0

ELECTRONIC IMPLEMENTATION OF XOR GATE:
DESCRIPTION:
OUT = A ^ B
The output of the XOR gate goes high only if the two inputs are different.

3.7.5 XNOR
SYMBOL:
TRUTH TABLE:

INPUT   OUTPUT
A  B    A XNOR B
0  0    1
0  1    0
1  0    0
1  1    1

ELECTRONIC IMPLEMENTATION OF XNOR GATE:
DESCRIPTION:
Y = A ~^ B
The output of the XNOR gate goes high only if the two inputs are the same.
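As a cross-check of the truth tables above, the gate behaviors can be evaluated with Python's bitwise operators over single-bit operands (the `~^` XNOR notation is Verilog's; in Python it is written as the complement of XOR):

```python
# Consistency check of the gate expressions used throughout this report.

def gates(a, b):
    return {
        "NOT A": (~a) & 1,        # inverter: output is the opposite level
        "AND":   a & b,           # high only when both inputs are high
        "OR":    a | b,           # high when one or both inputs are high
        "XOR":   a ^ b,           # high only when the inputs differ
        "XNOR":  (~(a ^ b)) & 1,  # high only when the inputs match
    }

for a in (0, 1):
    for b in (0, 1):
        g = gates(a, b)
        assert g["XNOR"] == 1 - g["XOR"]          # XNOR complements XOR
        assert g["XOR"] == (1 if a != b else 0)
        assert g["NOT A"] == 1 - a
```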
CHAPTER-4: MEMORY
4.1 MEMORY
Since the dawn of the electronic era, memory or storage devices have been an integral
part of electronic systems. As the electronic industry matured and moved away from vacuum
tubes to semiconductor devices, research in the area of semiconductor memories also
intensified. Semiconductor memory uses semiconductor-based integrated circuits to store
information. The semiconductor memory industry evolved and prospered along with the
digital computer revolution. Today, semiconductor memory arrays are widely used in many
VLSI subsystems, including microprocessors and other digital systems. In these systems, they
are used to store programs and data and in almost all cases have replaced core memory as the
active main memory. More than half of the real estate in many state-of-the-art microprocessors is devoted to cache memories, which are essentially semiconductor memory arrays. System designers' (both hardware and software) unmitigated quest for more memory
capacity has accelerated the growth of the semiconductor memory industry. One of the
factors that determine a digital computer’s performance improvement is its ability to store
and retrieve massive amounts of data quickly and inexpensively. Since the beginning of the
computer age, this fact has led to the search for ideal memories. The ideal memory would be
low cost, high performance, high density, with low-power dissipation, random access,
nonvolatile, easy to test, highly reliable, and standardized throughout the industry.
Unfortunately, a single memory having all these characteristics has not yet been developed,
although each of the characteristics is held by one or another of the MOS memories. Today,
MOS memories dominate the semiconductor memory market.
4.2. MEMORY CLASSIFICATION
Semiconductor memories can be classified in many different ways. Semiconductor
memories are generally classified based on the basic operation mode, nature of the data
storage mechanism, access patterns, and the storage cell operation.
Basic operation mode: Some memory circuits allow modification of information. In
other words, we can read data from the memory and write new data into the memory,
whereas other types of memory only allow reading of prewritten information. On the basis of
this criterion, memories are classified into two major categories: read/write memories (RWMs) and ROMs. RWMs are more popularly referred to as random access memories (RAMs). In the early days, RAMs were referred to by that name to contrast them with non-semiconductor memories such as magnetic tapes that allow only sequential access.
be noted that ROMs also allow random access the way RAMs do; however, they are not
generally called RAMs.
Storage mode: On the basis of its ability to retain the stored information with respect
to the ON/OFF state of the power supply, semiconductor memories can be classified into
two types: volatile and nonvolatile memories. Volatile memory loses all the stored
information once the power supply is turned OFF. RAM is an example of volatile memory.
Nonvolatile memory, on the other hand, retains the stored information even when the power
supply is turned OFF. ROMs and flash memories are examples of nonvolatile memories.
Nonvolatile memories can be further divided into two categories: nonvolatile ROMs (e.g.,
mask-programmed ROM) and nonvolatile read–write memories (e.g., Flash, EPROM, and
EEPROM) (Table).
Table: Memory Classification
Access patterns: On the basis of the order in which data can be accessed, memories
can be classified into two different categories: RAMs and non-RAMs. Most memories belong
to the random access class. In RAMs, information can be stored or retrieved in a random
order at a fixed rate, independent of physical location. There are two kinds of RAMs: static
random access memories (SRAMs) and dynamic random access memories (DRAMs). In
SRAMs, data is stored in a latch and it retains the data written on the cell as long as the
power supply to the memory is retained. In DRAMs, the data is stored in a capacitance as
electric charge and the written data needs to be periodically refreshed to compensate for the
charge leakage of the capacitance. It should be noted that both SRAM and DRAM are
volatile memories, i.e., they lose the written information as soon as the power supply is
turned OFF.
Examples of non-RAMs are serial access memory (SAM) and content-addressable memories (CAMs). SAM can be visualized as the opposite of RAM. SAM stores data as a
series of memory cells that can only be accessed sequentially. If the data is not in the current
location, each memory cell is checked until the needed data is found. SAM works very well
for memory buffers, where the data is normally stored in the order in which it will be used.
Texture buffer memory on a video card is an example of SAM. In RAM, we give an address
to the memory chip and we can retrieve the information stored in that particular address. But
a CAM is designed such that when a data word (an assemblage of bits usually the width of
the address bus) is supplied to the chip, the CAM searches its entire memory to see if that
data word is stored anywhere in the chip. If the data word is found, the CAM returns a list of
one or more storage addresses where the word was found and in some architectures, it also
returns the data word.
Finally, there needs to be a way to denote how much data can be stored by any
particular memory device. This, fortunately for us, is very simple and straightforward: just
count up the number of bits (or bytes, 1 byte = 8 bits) of total data storage space. Due to the
high capacity of modern data storage devices, metric prefixes are generally affixed to the unit
of bytes in order to represent storage space: 1.6 Gigabytes is equal to 1.6 billion bytes, or
12.8 billion bits, of data storage capacity. The only caveat here is to be aware of rounded
numbers. Because the storage mechanisms of many random-access memory devices are
typically arranged so that the number of "cells" in which bits of data can be stored appears in
binary progression (powers of 2), a "one kilobyte" memory device most likely contains 1024
(2 to the power of 10) locations for data bytes rather than exactly 1000. A "64 kbyte" memory
device actually holds 65,536 bytes of data (2 to the 16th power), and should probably be
called a "66 Kbyte" device to be more precise. When we round numbers in our base-10
system, we fall out of step with the round equivalents in the base-2 system.
One simple memory circuit is called the data latch, or D-latch. This is a device which,
when “told” to do so via the clock input, notes the state of its input and holds that state at its
output. The output state remains unchanged even if the input state changes, until another
update request is received. Traditionally, the input of the D-latch is designated by D and the latched output by Q. The update command is provided by asserting the clock input, either as a transition (from HI to LO or from LO to HI) in so-called edge-triggered devices, or as a level in level-triggered devices, where the output follows the input whenever the clock is HI.
D-Latch Symbol and Truth Tables
Data present on the input D is passed to the outputs Q and Q' when the clock is asserted. The truth table for an edge-triggered D-latch is shown to the right of the schematic symbol. Some D-latches also have Preset and Clear inputs that allow the output to be set HI or LO independent of the clock signal. In normal operation, these two inputs are pulled high so as not to interfere with the clocked logic. However, the outputs Q and Q' can be initialized to a known state using the Preset and Clear inputs when the clocked logic is not active.
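The behavior described above can be sketched as a small Python model (illustrative only; the positive-edge trigger and the active-low Preset/Clear polarities are assumptions):

```python
# Behavioral sketch of a positive-edge-triggered D-latch with asynchronous
# active-low Preset and Clear inputs.

class DLatch:
    def __init__(self):
        self.q = 0
        self._clk = 0   # previous clock level, for edge detection

    def step(self, d, clk, preset_n=1, clear_n=1):
        if not preset_n:             # Preset forces Q HI, regardless of clock
            self.q = 1
        elif not clear_n:            # Clear forces Q LO, regardless of clock
            self.q = 0
        elif clk and not self._clk:  # LO -> HI clock edge: capture D
            self.q = d
        self._clk = clk
        return self.q                # Q' (not modeled) is the complement

latch = DLatch()
latch.step(d=1, clk=0)
assert latch.step(d=1, clk=1) == 1   # rising edge latches D
assert latch.step(d=0, clk=1) == 1   # no edge: output holds despite D changing
```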
CHAPTER-5: VERILOG
5.1 VERILOG
In the semiconductor and electronic design industry, Verilog is a hardware description
language (HDL) used to model electronic systems. Verilog HDL, not to be confused with
VHDL, is most commonly used in the design, verification, and implementation of digital
logic chips at the register transfer level (RTL) of abstraction. It is also used in the verification of analog and mixed-signal circuits.
5.2 HISTORY OF VERILOG
Beginning
Verilog was invented by Phil Moorby and Prabhu Goel during the winter of 1983/1984 at
Automated Integrated Design Systems (later renamed to Gateway Design Automation in
1985) as a hardware modeling language. Gateway Design Automation was later purchased by
Cadence Design Systems in 1990. Cadence now has full proprietary rights to Gateway's Verilog and the Verilog-XL logic simulator.
Verilog-95
With the increasing success of VHDL at the time, Cadence decided to make the language
available for open standardization. Cadence transferred Verilog into the public domain under
the Open Verilog International (OVI) (now known as Accellera) organization. Verilog was
later submitted to IEEE and became IEEE Standard 1364-1995, commonly referred to as
Verilog-95.
In the same time frame Cadence initiated the creation of Verilog-A to put standards support
behind its analog simulator Spectre. Verilog-A was never intended to be a standalone
language and is a subset of Verilog-AMS which encompassed Verilog-95.
Verilog 2001
Extensions to Verilog-95 were submitted back to IEEE to cover the deficiencies that users
had found in the original Verilog standard. These extensions became IEEE Standard 1364-
2001 known as Verilog-2001.
Verilog-2001 is a significant upgrade from Verilog-95. First, it adds explicit support for (2's
complement) signed nets and variables. Previously, code authors had to perform signed-
operations using awkward bit-level manipulations (for example, the carry-out bit of a simple
8-bit addition required an explicit description of the boolean-algebra to determine its correct
value.) The same function under Verilog-2001 can be more succinctly described by one of
the built-in operators: +, -, /, *, >>>. A generate/endgenerate construct (similar to VHDL's
generate/endgenerate) allows Verilog-2001 to control instance and statement instantiation
through normal decision-operators (case/if/else). Using generate/endgenerate, Verilog-2001
can instantiate an array of instances, with control over the connectivity of the individual
instances. File I/O has been improved by several new system-tasks. And finally, a few syntax
additions were introduced to improve code readability (e.g., always @*, named-parameter override, C-style function/task/module header declarations).
Verilog-2001 is the dominant flavor of Verilog supported by the majority of commercial
EDA software packages.
Verilog 2005
Not to be confused with SystemVerilog, Verilog 2005 (IEEE Standard 1364-2005) consists of
minor corrections, spec clarifications, and a few new language features (such as the uwire
keyword.)
A separate part of the Verilog standard, Verilog-AMS, attempts to integrate analog and mixed-signal modelling with traditional Verilog.
SYSTEM VERILOG
SystemVerilog is a superset of Verilog-2005, with many new features and capabilities to aid
design-verification and design-modeling.
The advent of High Level Verification languages such as OpenVera, and Verisity's E
language encouraged the development of Superlog by Co-Design Automation Inc. Co-Design
Automation Inc was later purchased by Synopsys. The foundations of Superlog and Vera
were donated to Accellera, which later became the IEEE standard P1800-2005:
SystemVerilog.
5.3 ABOUT LANGUAGE
Hardware description languages, such as Verilog, differ from software programming
languages in several fundamental ways. HDLs add the concept of concurrency, which is
parallel execution of multiple statements in explicitly specified threads, propagation of time,
and signal dependency (sensitivity). There are two assignment operators, a blocking
assignment (=), and a non-blocking (<=) assignment. The non-blocking assignment allows
designers to describe a state-machine update without needing to declare and use temporary
storage variables. Since these concepts are part of the Verilog's language semantics, designers
could quickly write descriptions of large circuits, in a relatively compact and concise form.
At the time of Verilog's introduction (1984), Verilog represented a tremendous productivity
improvement for circuit designers who were already using graphical schematic-capture, and
specially-written software programs to document and simulate electronic circuits.
The designers of Verilog wanted a language with syntax similar to the C programming
language, which was already widely used in engineering software development. Verilog is
case-sensitive, has a basic preprocessor (though less sophisticated than ANSI C/C++), and
equivalent control flow keywords (if/else, for, while, case, etc.), and compatible language
operators precedence. Syntactic differences include variable declaration (Verilog requires bit-
widths on net/reg types), demarcation of procedural-blocks (begin/end instead of curly braces
{}), though there are many other minor differences.
A Verilog design consists of a hierarchy of modules. Modules encapsulate design hierarchy,
and communicate with other modules through a set of declared input, output, and
bidirectional ports. Internally, a module can contain any combination of the following:
net/variable declarations (wire, reg, integer, etc.), concurrent and sequential statement blocks
and instances of other modules (sub-hierarchies). Sequential statements are placed inside a
begin/end block and executed in sequential order within the block. But the blocks themselves
are executed concurrently, qualifying Verilog as a Dataflow language.
Verilog's concept of 'wire' consists of both signal values (4-state: "1, 0, floating, undefined"),
and strengths (strong, weak, etc.) This system allows abstract modeling of shared signal-lines,
where multiple sources drive a common net. When a wire has multiple drivers, the wire's
(readable) value is resolved by a function of the source drivers and their strengths.
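As a rough illustration (an assumption for the simple case, not the full IEEE strength-resolution table), the following Python function resolves a 4-state wire when every driver has the same "strong" strength:

```python
def resolve(drivers):
    """Resolve a list of 4-state driver values ('0', '1', 'x', 'z')
    assuming all drivers have equal strength."""
    active = [d for d in drivers if d != 'z']   # 'z' = floating, no drive
    if not active:
        return 'z'                               # nothing driving the net
    if all(d == active[0] for d in active):
        return active[0]                         # all drivers agree
    return 'x'                                   # conflicting drivers

print(resolve(['1', 'z']))   # '1'
print(resolve(['1', '0']))   # 'x'
print(resolve(['z', 'z']))   # 'z'
```

With unequal strengths, the stronger driver would win instead of producing 'x'; that refinement is omitted here.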
A subset of statements in the Verilog language is synthesizable. Verilog modules that
conform to a synthesizable coding-style, known as RTL (register transfer level), can be
physically realized by synthesis software. Synthesis-software algorithmically transforms the
(abstract) Verilog source into a netlist, a logically-equivalent description consisting only of
elementary logic primitives (AND, OR, NOT, flipflops, etc.) that are available in a specific
VLSI technology. Further manipulations to the netlist ultimately lead to a circuit-fabrication
blueprint (such as a photomask set for an ASIC) or a bitstream file for an FPGA.
There are now two industry standard hardware description languages, VHDL and Verilog.
The complexity of ASIC and FPGA designs has meant an increase in the number of specialist
design consultants with specific tools and with their own libraries of macro and mega cells
written in either VHDL or Verilog. As a result, it is important that designers know both
VHDL and Verilog and that EDA tools vendors provide tools that provide an environment
allowing both languages to be used in unison. For example, a designer might have a model of
a PCI bus interface written in VHDL, but wants to use it in a design with macros written in
Verilog.
VHDL (Very high speed integrated circuit Hardware Description Language) became IEEE
standard 1076 in 1987. It was updated in 1993 and is known today as "IEEE standard
1076-1993". The Verilog hardware description language has been used far longer than VHDL and
has been used extensively since it was launched by Gateway in 1983. Cadence bought
Gateway in 1989 and opened Verilog to the public domain in 1990. It became IEEE standard
1364 in December 1995.
There are two aspects to modeling hardware that any hardware description language
facilitates; true abstract behavior and hardware structure. This means modeled hardware
behavior is not prejudiced by structural or design aspects of hardware intent and that
hardware structure is capable of being modeled irrespective of the design's behavior.
5.4 VHDL/VERILOG COMPARED & CONTRASTED
This section compares and contrasts individual aspects of the two languages; they are listed in
alphabetical order.
Capability
Hardware structure can be modeled equally effectively in both VHDL and Verilog. When
modeling abstract hardware, the capability of VHDL can sometimes only be achieved in
Verilog when using the PLI. The choice of which to use is therefore not based solely on
technical capability, but on:
personal preferences
EDA tool availability
commercial, business and marketing issues
The modeling constructs of VHDL and Verilog cover a slightly different spectrum across the
levels of behavioral abstraction; see Figure 1.
Figure 1: HDL modeling capability
COMPILATION
VHDL. Multiple design units (entity/architecture pairs) that reside in the same system file
may be separately compiled if so desired. However, it is good design practice to keep each
design unit in its own system file, in which case separate compilation should not be an issue.
Verilog. The Verilog language is still rooted in its native interpretative mode. Compilation is
a means of speeding up simulation, but it has not changed the original nature of the language.
As a result, care must be taken with both the compilation order of code written in a single file
and the compilation order of multiple files. Simulation results can change simply by changing
the order of compilation.
DATA TYPES
Verilog. Compared to VHDL, Verilog data types are very simple, easy to use and very
much geared towards modeling hardware structure as opposed to abstract hardware modeling.
Unlike VHDL, all data types used in a Verilog model are defined by the Verilog language
and not by the user. There are net data types, for example wire, and a register data type called
reg. A model with a signal whose type is one of the net data types has a corresponding
electrical wire in the implied modeled circuit. Objects (that is, signals) of type reg hold their
value over simulation delta cycles and should not be confused with the modeling of a
hardware register. Verilog may be preferred because of its simplicity.
Design reusability
Verilog. There is no concept of packages in Verilog. Functions and procedures used within a
model must be defined in the module. To make functions and procedures generally accessible
from different module statements the functions and procedures must be placed in a separate
system file and included using the `include compiler directive.
Easiest to Learn
Starting with zero knowledge of either language, Verilog is probably the easiest to grasp and
understand. This assumes the Verilog compiler directive language for simulation and the PLI
language are not included. If these languages are included, they can be looked upon as two
additional languages that need to be learned. VHDL may seem less intuitive at first for two
primary reasons. First, it is very strongly typed; a feature that makes it robust and powerful
for the advanced user after a longer learning phase. Second, there are many ways to model
the same circuit, especially those with large hierarchical structures.
Forward and back annotation
A spin-off from Verilog is the Standard Delay Format (SDF). This is a general purpose
format used to define the timing delays in a circuit. The format provides a bidirectional link
between, chip layout tools, and either synthesis or simulation tools, in order to provide more
accurate timing representations. The SDF format is now an industry standard in its own right.
High level constructs
Verilog. Except for being able to parameterize models by overloading parameter constants,
there is no equivalent to the high-level VHDL modeling statements in Verilog.
LANGUAGE EXTENSIONS
The use of language extensions will make a model non standard and most likely not portable
across other design tools. However, sometimes they are necessary in order to achieve the
desired results.
Verilog The Programming Language Interface (PLI) is an interface mechanism between
Verilog models and Verilog software tools. For example, a designer, or more likely, a Verilog
tool vendor, can specify user defined tasks or functions in the C programming language, and
then call them from the Verilog source description. Use of such tasks or functions make a
Verilog model nonstandard and so may not be usable by other Verilog tools. Their use is not
recommended.
Libraries
Verilog. There is no concept of a library in Verilog. This is due to its origins as an
interpretive language.
Low Level Constructs
Verilog. The Verilog language was originally developed with gate level modeling in mind,
and so has very good constructs for modeling at this level and for modeling the cell
primitives of ASIC and FPGA libraries. Examples include User Defined Primitives (UDPs),
truth tables and the specify block for specifying timing delays across a module.
Managing large designs
Verilog. There are no statements in Verilog that help manage large designs.
Operators
The majority of operators are the same between the two languages. Verilog does have very
useful unary reduction operators that are not in VHDL. A loop statement can be used in
VHDL to perform the same operation as a Verilog unary reduction operator. VHDL has the
mod operator that is not found in Verilog.
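To make the comparison concrete, the Python stand-ins below (illustrative function names, not a Verilog feature) fold an operator across all bits of a value, exactly what Verilog's unary reduction operators &v, |v and ^v do in a single token:

```python
from functools import reduce
import operator

def bits(v, width):
    """List the individual bits of v, LSB first."""
    return [(v >> i) & 1 for i in range(width)]

# Equivalents of Verilog's &v, |v, ^v on a fixed-width value
def reduce_and(v, width): return reduce(operator.and_, bits(v, width))
def reduce_or(v, width):  return reduce(operator.or_,  bits(v, width))
def reduce_xor(v, width): return reduce(operator.xor,  bits(v, width))

print(reduce_and(0b1111, 4))  # 1: all bits set
print(reduce_or(0b0000, 4))   # 0: no bit set
print(reduce_xor(0b1011, 4))  # 1: odd number of set bits (parity)
```

A VHDL loop performing the same fold needs several lines, which is the point being made above.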
Parameterizable models
Verilog. A specific width model can be instantiated from a generic n-bit model using
overloaded parameter values. The generic model must have a default parameter value
defined. This means two things. In the absence of an overloaded value being specified, it will
still synthesize, but will use the specified default parameter value. Also, it does not need to be
instantiated with an overloaded parameter value specified, before it will synthesize.
Procedures and tasks
VHDL allows concurrent procedure calls; Verilog does not allow concurrent task calls.
Readability
This is more a matter of coding style and experience than language feature. VHDL is a
concise and verbose language; its roots are based on Ada. Verilog is more like C because its
constructs are based approximately 50% on C and 50% on Ada. For this reason an existing C
programmer may prefer Verilog over VHDL. Although an existing programmer of both C
and Ada may find the mix of constructs somewhat confusing at first. Whatever HDL is used,
when writing or reading an HDL model to be synthesized it is important to think about
hardware intent.
Structural replication
Verilog. There is no equivalent to the generate statement in Verilog.
Test harnesses
Designers typically spend about 50% of their time writing synthesizable models and the other
50% writing a test harness to verify the synthesizable models. Test harnesses are not
restricted to the synthesizable subset and so are free to use the full potential of the language.
VHDL has generic and configuration statements that are useful in test harnesses, that are not
found in Verilog.
Verboseness
Verilog. Signals representing objects of different bit widths may be assigned to each other.
The signal representing the smaller number of bits is automatically padded out to that of the
larger number of bits, and is independent of whether it is the assigned signal or not. Unused
bits will be automatically optimized away during the synthesis process. This has the
advantage of not needing to model quite so explicitly as in VHDL, but does mean unintended
modeling errors will not be identified by an analyzer.
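The padding and truncation behaviour described above can be sketched in Python (an illustration with assumed widths, not the Verilog LRM rules in full):

```python
def assign(value, src_width, dst_width):
    """Model assigning a src_width-bit value to a dst_width-bit signal:
    narrower sources are zero-padded, wider sources are silently truncated."""
    value &= (1 << src_width) - 1        # keep only the source bits
    return value & ((1 << dst_width) - 1)  # fit into the destination width

print(format(assign(0b101, 3, 8), '08b'))   # '00000101': zero-padded
print(format(assign(0x1F5, 9, 4), '04b'))   # '0101': upper bits silently lost
```

The second call shows the hazard the paragraph warns about: bits are dropped with no diagnostic from the analyzer.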
CHAPTER-6: CADENCE
6.1 CADENCE TOOLS
The Cadence suite is a huge collection of programs for different CAD applications
from VLSI design to high-level DSP programming. The suite is divided into different
“packages,” and for VLSI design, the packages we will be using are the IC package and the
DSMSE package.
The Cadence toolset is a complete microchip EDA system, which is intended to
develop professional, full-scale, mixed-signal microchips and breadboards. The modules
included in the toolset are for schematic entry, design simulation, data analysis, physical
layout, and final verification. The strength of the Cadence tools is in its analog
design/simulation/layout and mixed signal verification and is often used in tandem with other
tools for RF and/or digital design/simulation/layout, where complete top-level verification is
done in the Cadence tools. Another important concept is that the Cadence tools only provide
a framework for doing design. Without a foundry-provided design kit, no design can be
done.
Cadence Design Systems, Inc. (NASDAQ: CDNS), the leader in global electronic-design
innovation, today said Global Unichip Corporation (GUC), a leading system-on-chip (SoC)
design foundry, is the first Taiwan-based design company to complete a successful tape out of
a 65-nanometer device. The success of this 65-nanometer tape out further strengthened GUC's
advanced technology capabilities to serve the top tier customers worldwide. GUC used the
Cadence(R) Low-Power Solution and SoC Encounter(TM) GXL RTL-to-GDSII system to
achieve the tape out.
6.2 ABOUT CADENCE COMPANY
Cadence enables global electronic-design innovation and plays an essential role in the
creation of today's integrated circuits and electronics. Customers use Cadence software and
hardware, methodologies, and services to design and verify advanced semiconductors,
consumer electronics, networking and telecommunications equipment, and computer
systems. Cadence reported 2006 revenues of approximately $1.5 billion, and has
approximately 5,200 employees. The company is headquartered in San Jose, Calif., with sales
offices, design centers, and research facilities around the world to serve the global electronics
industry.
Since ours is a digital design project, the tools used in our project are:
IUS - Incisive Unified Simulator.
RC-RTL Compiler.
SOC encounter-System On Chip encounter.
These tools work on 18-nanometer technology.
Now, we will study in detail about these tools and results of our project using these tools.
FLOW OF DESIGN USING CADENCE TOOLS
6.3 INCISIVE UNIFIED SIMULATOR
Incisive Unified Simulator is a tool used to simulate digital circuits. The designs are
represented using many different languages, such as Verilog or VHDL. IUS supports those languages, as well as
additional languages used for specialized verification functions, such as SystemC, a derivative of C++. The tool
handles any design that can be represented using a digital representation with the key languages. The Verilog
only environment is called NC-Verilog and the VHDL one is called NC-VHDL. Designers depending on the
complexity of their simulation tasks will create environments that use multiple languages to perform advanced
verification tasks.
Who needs IUS:
• System architects who need to analyze various scenarios to determine what the right
grouping of components would be. This is typically done with simple IP models to look at
high-level behavior.
• Design engineers who are creating the various parts of the circuit use IUS to test the
behavior and make sure the requirements are met.
• Verification engineers, a specialized team that take the design once it is completed and
create tests that exercise the complete design, testing actual conditions as closely as possible.
• IP vendors use IUS to create IP models and ensure that their models behave correctly with
the tools that their customers will use.
• Board designers use IUS as a means of board-level verification.
BENEFITS
• Speeds time-to-market with lower risk and higher predictability.
• Increases productivity by enabling verification to start months earlier, before test bench development and
simulation.
• Improves quality and reduces risk of re-spins by exposing corner-case functional bugs that are difficult or
impossible to find using conventional methods.
• Reduces block design effort and debug time, and shortens integration time.
• Provides design teams with an advanced debug environment with simulation synergies for ease of adoption.
• Offers the ultimate simulation-based speed and efficiency.
• Increases RTL performance by 100 times with native transaction-level simulation and optional Acceleration-
on-Demand.
• Reduces test bench development up to 50% with transaction-level support, unified test generation, and
verification component re-use.
• Shortens verification time, finds bugs faster, and eliminates exhaustive simulation runs with dynamic assertion
checking.
• Decreases debug time up to 25% through unified transaction/signal viewing, HDL analysis capability, and
unified debug environment for all languages.
ncverilog
This program is a front-end to some of the other tools in this directory. Its job is to compile, elaborate,
and launch the simulation.
ncvlog
This is the Verilog compiler. Typing this command in with no arguments gives you a listing of the possible
options. Two arguments which are useful are -cdslib, which specifies the location of your cds.lib file, and -
work, which specifies the location of your worklib file.
To compile a Verilog file named test.v, and its test bench named tb_test.v, using a cdslib of cds.lib, and a work
library of worklib, you can execute the following command:
ncvlog -cdslib cds.lib -work worklib test.v tb_test.v
ncelab
This is the elaborator. Again, typing this command with no arguments outputs a list of options. The above -
cdslib and -work arguments apply.
To elaborate a compiled test bench called tb_test.v, execute the following command:
ncelab -cdslib cds.lib -work worklib worklib.tb_test_v
ncsim
This is the actual simulator.
To launch a compiled and elaborated test bench, execute the following command:
ncsim -gui -cdslib cds.lib -work worklib worklib.tb_test_v:module
CHAPTER- 7: VERILOG CODING
MODULE - CONTROL UNIT
`resetall
`timescale 1 ns / 1 ns

module cu (inp1, inp2, s, refclk, result, carry);
  input [15:0] inp1, inp2;
  input [3:0] s;
  input refclk;
  output [31:0] result;
  output carry;

  reg [3:0] opc;
  reg rw, um;
  wire [31:0] res;
  wire car, clk, spi_clk;
  wire [15:0] data;

  assign result = res;
  assign carry = car;

  alu alu_inst (inp1, inp2, opc, rw, um, res, car);

  always @ (posedge refclk or rw or um or opc)
  begin
    opc = {s[3], s[2], s[1], s[0]};
    if (um == 1)
      rw = 1;
    else if (um == 0)
      rw = 1;
  end
endmodule
MODULE – ALU

`resetall
`timescale 1 ns / 1 ns

module alu (in1, in2, s, rw, um, out, cout);
  input [15:0] in1, in2;
  input [3:0] s;
  input rw, um;
  output [31:0] out;
  output cout;

  wire [31:0] out;
  wire [15:0] out22, out33, out77;
  wire [31:0] out88;
  wire cout1;
  wire [15:0] datout;
  wire [1:0] ss;
  reg [15:0] inn1, inn2, int, int1;
  reg [15:0] out1, out3, out4, out5, out6, out7, out8, out9, out10;
  reg [31:0] out2;
  reg [15:0] m1, m2;
  reg cc, en, e, enn;

  // 4-bit operation select, decoded as a nested conditional
  assign out = s[3] ? (s[2] ? (s[1] ? (s[0] ? out10 : out9) : out8) : out8)
                    : (s[2] ? (s[1] ? (s[0] ? out7 : out6)
                                    : (s[0] ? out5 : out4))
                            : (s[1] ? (s[0] ? out3 : out2) : out1));
  assign cout = cout1;

  add16 adder16bit_inst (in1, inn2, cc, enn, out22, cout1);
  mul multiplication_inst (m1, m2, out88);
  cmpr comparator_inst (in1, int, e, ss, out77);
  mem memory_inst (in2, in1, rw, en, datout);

  always @ (in1 or in2 or s or rw)
  begin
    en = 1;
    if (um == 1)
      int = datout;   // operand comes from memory
    else
      int = in2;      // operand comes from the input port
    inn2 <= {16{s[0]}} ^ int;   // invert the second operand for subtraction
    cc <= s[0];                 // carry-in completes the two's complement
    e <= 1;
    m1 = in1;
    m2 = int;
    enn <= 1;
    out6 = ~in1;
    out2 = out88;
    out3 = in1 | int;
    out1 = out22;
    out4 = in1 & int;
    out5 = in1 ^ int;
    out7 = out33;
    out8 = out77;
    out9 = in1;
    out10 = int;
  end
endmodule
MODULE – 16 BIT ADDER
`resetall
`timescale 1 ns / 1 ns

module add16 (a, b, c_in, Enn, sum, cout);
input [15:0] a, b;
input c_in;
input Enn;
output [15:0] sum;
output cout;
reg [15:0] p, g;
reg [16:0] carry;
reg [15:0] s;
integer i;
assign sum = s;
assign cout = carry[16];
always @ (Enn)
begin
p = (a ^ b);   // carry-propagate terms
g = (a & b);   // carry-generate terms
carry[0] = c_in;
carry[1] = (g[0] | (p[0] & c_in));
carry[2] = (g[1] | (p[1] & g[0]) | (p[1] & p[0] & c_in));
carry[3] = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | ( p[2] & p[1] & p[0] & c_in ));
carry[4] = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0]) | ( p[3] & p[2] & p[1] & p[0] & c_in));
carry[5] = (g[4] | (p[4] & g[3]) | (p[4] & p[3] & g[2]) | (p[4] & p[3] & p[2] & g[1]) | (p[4] & p[3] & p[2] & p[1] & g[0]) | (p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
carry[6] = (g[5] | (p[5] & g[4]) | (p[5] & p[4] & g[3]) | (p[5] & p[4] & p[3] & g[2]) | (p[5] & p[4] & p[3] & p[2] & g[1]) | (p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
carry[7] = (g[6] | (p[6] & g[5]) | (p[6] & p[5] & g[4]) | (p[6] & p[5] & p[4] & g[3]) | (p[6] & p[5] & p[4] & p[3] & g[2]) | (p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
carry[8] = (g[7] | (p[7] & g[6]) | (p[7] & p[6] & g[5]) | ( p[7] & p[6] & p[5] & g[4]) | ( p[7] & p[6] & p[5] & p[4] & g[3]) | (p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
carry[9] = (g[8] | (p[8] & g[7]) | ( p[8] & p[7] & g[6]) | (p[8] & p[7] & p[6] & g[5]) | (p[8] & p[7] & p[6] & p[5] & g[4]) | (p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
carry[10] = (g[9] | (p[9] & g[8]) | (p[9] & p[8] & g[7]) | (p[9] & p[8] & p[7] & g[6]) | (p[9] & p[8] & p[7] & p[6] & g[5]) | (p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
carry[11] = (g[10] | (p[10] & g[9]) | (p[10] & p[9] & g[8]) | (p[10] & p[9] & p[8] & g[7]) | (p[10] & p[9] & p[8] & p[7] & g[6]) | (p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
carry[12] = (g[11] | (p[11] & g[10]) | (p[11] & p[10] & g[9]) | (p[11] & p[10] & p[9] & g[8]) | (p[11] & p[10] & p[9] & p[8] & g[7]) | (p[11] & p[10] & p[9] & p[8] & p[7] & g[6]) | (p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
carry[13] = (g[12] | (p[12] & g[11]) | (p[12] & p[11] & g[10]) | (p[12] & p[11] & p[10] & g[9]) | (p[12] & p[11] & p[10] & p[9] & g[8]) | (p[12] & p[11] & p[10] & p[9] & p[8] & g[7]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & g[6]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
carry[14] = (g[13] | (p[13] & g[12] ) | (p[13] & p[12] & g[11]) | (p[13] & p[12] & p[11] & g[10]) | (p[13] & p[12] & p[11] & p[10] & g[9]) | (p[13] & p[12] & p[11] & p[10] & p[9] & g[8]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & g[7]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & g[6]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[13] &
p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
carry[15] = (g[14] | (p[14] & g[13]) | (p[14] & p[13] & g[12]) | (p[14] & p[13] & p[12] & g[11]) | (p[14] & p[13] & p[12] & p[11] & g[10]) | (p[14] & p[13] & p[12] & p[11] & p[10] & g[9]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & g[8]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & g[7]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & g[6]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
carry[16] = (g[15] | (p[15] & g[14]) | (p[15] & p[14] & g[13]) | (p[15] & p[14] & p[13] & g[12]) | (p[15] & p[14] & p[13] & p[12] & g[11]) | (p[15] & p[14] & p[13] & p[12] & p[11] & g[10]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & g[9]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & g[8]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & g[7]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & g[6]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
for (i = 0; i < 16; i = i + 1)
  s[i] = (p[i] ^ carry[i]);
end
endmodule
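The carry-lookahead equations above are the unrolled form of the recurrence carry[i+1] = g[i] | (p[i] & carry[i]). The Python model below (an illustrative check, not the project's testbench) computes the carries with that recurrence and confirms the sum matches ordinary integer addition:

```python
def add16(a, b, c_in):
    """Model of the add16 module: 16-bit carry-lookahead addition."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(16)]  # propagate bits
    g = [((a >> i) & (b >> i)) & 1 for i in range(16)]  # generate bits
    carry = [c_in] + [0] * 16
    for i in range(16):
        carry[i + 1] = g[i] | (p[i] & carry[i])         # lookahead recurrence
    s = sum(((p[i] ^ carry[i]) << i) for i in range(16))
    return s, carry[16]                                  # (sum, carry out)

print(add16(0xFFFF, 0x0001, 0))  # (0, 1): wraps around with carry out
print(add16(1234, 4321, 0))      # (5555, 0)
```

In hardware the unrolled equations evaluate every carry in parallel; the sequential loop here produces the same values.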
MODULE – MULTIPLICATION

`resetall
`timescale 1 ns / 1 ns

module mul (a, b, mult);
  input [15:0] a, b;
  output [31:0] mult;

  integer i, j;
  reg [15:0] prod1, prod, sum;
  reg carry;
  reg [31:0] m;

  assign mult = m;

  always @ (a or b)
  begin
    // first row: a[0] ANDed with every bit of b
    for (j = 0; j < 16; j = j + 1)
      prod1[j] = a[0] & b[j];
    m[0] = prod1[0];
    prod1 = prod1 >> 1;
    // remaining rows: ripple-add each partial product into the running sum
    for (i = 1; i < 16; i = i + 1)
    begin
      carry = 0;   // the ripple carry starts at zero for each row
      for (j = 0; j < 16; j = j + 1)
        prod[j] = a[i] & b[j];
      for (j = 0; j < 16; j = j + 1)
      begin
        sum[j] = prod[j] ^ prod1[j] ^ carry;
        carry = (carry & prod[j]) | (carry & prod1[j]) | (prod[j] & prod1[j]);
      end
      m[i] = sum[0];
      prod1 = sum >> 1;
      prod1[15] = carry;
    end
    for (i = 0; i < 16; i = i + 1)
      m[i+16] = prod1[i];
  end
endmodule
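The shift-and-add scheme used by the mul module can be modelled compactly in Python (an illustration, not a bit-for-bit simulation of the RTL): each set bit of `a` selects a shifted copy of `b`, which is accumulated into a 32-bit product.

```python
def mul16(a, b):
    """Shift-and-add model of 16x16 unsigned multiplication."""
    acc = 0
    for i in range(16):
        if (a >> i) & 1:
            acc += b << i        # add the shifted partial product
    return acc & 0xFFFFFFFF      # 32-bit result

print(mul16(0xFFFF, 0xFFFF))     # 4294836225 == 65535 * 65535
print(mul16(300, 7))             # 2100
```

The largest possible product, 65535 x 65535, still fits in 32 bits, which is why the module's output is [31:0].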
MODULE - COMPARATOR
`resetall
`timescale 1 ns / 1 ns

module cmpr (a, b, en, cmp, grt);
  input [15:0] a, b;
  input en;
  output [1:0] cmp;
  output [15:0] grt;

  reg [15:0] x, ar, br, grtr;
  integer i;
  reg eq, gr, ls;
  reg [15:0] c, d;
  reg [1:0] compr;

  assign cmp = compr;
  assign grt = grtr;

  always @ (en)
  begin
    ar = a;
    br = b;
    for (i = 15; i > -1; i = i - 1)
    begin
      x[i] = (a[i] & b[i]) | ((~a[i]) & (~b[i]));   // 1 where the bits match
      c[i] = ~b[i];
      d[i] = ~a[i];
    end
    eq = x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & x[2] & x[1] & x[0];
    gr = (a[15] & c[15]) | (x[15] & a[14] & c[14]) | (x[15] & x[14] & a[13] & c[13]) | (x[15] & x[14] & x[13] & a[12] & c[12]) | (x[15] & x[14] & x[13] & x[12] & a[11] & c[11]) | (x[15] & x[14] & x[13] & x[12] & x[11] & a[10] & c[10]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & a[9] & c[9]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & a[8] & c[8]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & a[7] & c[7]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & a[6] & c[6]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & a[5] & c[5]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & a[4] & c[4]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & a[3] & c[3]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & a[2] & c[2]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & x[2] & a[1] & c[1]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & x[2] & x[1] & a[0] & c[0]);
    ls = (d[15] & b[15]) | (x[15] & d[14] & b[14]) | (x[15] & x[14] & d[13] & b[13]) | (x[15] & x[14] & x[13] & d[12] & b[12]) | (x[15] & x[14] & x[13] & x[12] & d[11] & b[11]) | (x[15] & x[14] & x[13] & x[12] & x[11] & d[10] & b[10]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & d[9] & b[9]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & d[8] & b[8]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & d[7] & b[7]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & d[6] & b[6]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & d[5] & b[5]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & d[4] & b[4]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & d[3] & b[3]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & d[2] & b[2]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & x[2] & d[1] & b[1]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & x[2] & x[1] & d[0] & b[0]);
    if (eq == 1)
    begin
      compr = 2'b00;   // equal
      grtr = ar;
    end
    else if (gr == 1)
    begin
      compr = 2'b01;   // a greater than b
      grtr = ar;
    end
    else if (ls == 1)
    begin
      compr = 2'b10;   // a less than b
      grtr = br;
    end
    else
      compr = 2'b11;
  end
endmodule
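The cmpr module's logic is a bitwise magnitude compare: scanning from the MSB, the first bit position where the operands differ decides the result. A compact Python model (illustrative, using the module's 00/01/10 encoding and also returning the larger operand, like grt):

```python
def cmpr16(a, b):
    """Bitwise magnitude compare of two 16-bit values.
    Returns (code, larger): '00' equal, '01' a>b, '10' a<b."""
    for i in range(15, -1, -1):               # scan from MSB to LSB
        abit, bbit = (a >> i) & 1, (b >> i) & 1
        if abit != bbit:                       # first differing bit decides
            return ('01', a) if abit else ('10', b)
    return ('00', a)                           # all bits matched

print(cmpr16(10, 10))   # ('00', 10)
print(cmpr16(42, 7))    # ('01', 42)
print(cmpr16(3, 9))     # ('10', 9)
```

The module's eq/gr/ls expressions are the fully unrolled combinational form of this scan.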
MODULE – MEMORY

`resetall
`timescale 1 ns / 1 ns

module mem (addr, datain, rw, en, dataout);
  input [15:0] datain, addr;
  output [15:0] dataout;
  input rw, en;

  reg [15:0] mem1 [0:65535];   // 64K words of 16 bits each
  reg [15:0] dat;

  assign dataout = dat;

  always @ (en)
  begin
    if (rw == 1)
      dat = mem1[addr];        // rw = 1: read
    else if (rw == 0)
    begin
      mem1[addr] = datain;     // rw = 0: write, and echo the data
      dat = datain;
    end
  end
endmodule
`noview
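The mem module's behaviour can be sketched in Python (an illustrative model with an assumed class name, not the RTL itself): a 64K-word store of 16-bit data, where rw=1 reads the addressed word and rw=0 writes it, echoing the written data on the output.

```python
class Mem:
    """Behavioural model of the mem module: 64K x 16-bit storage."""
    def __init__(self):
        self.words = {}                        # sparse word store, default 0

    def access(self, addr, datain, rw):
        addr &= 0xFFFF
        if rw:                                 # rw == 1: read
            return self.words.get(addr, 0)
        self.words[addr] = datain & 0xFFFF     # rw == 0: write
        return datain & 0xFFFF                 # output echoes the data

m = Mem()
print(m.access(0x10, 0xBEEF, 0))  # 48879: write echoes the data
print(m.access(0x10, 0x0000, 1))  # 48879: read returns the stored word
```

A dictionary keeps the model sparse; the hardware array, of course, allocates all 65536 words.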
CHAPTER- 8: RESULTS