An Educational Processor-A Design Approach_298

download An Educational Processor-A Design Approach_298

of 40

Transcript of An Educational Processor-A Design Approach_298

  • 7/31/2019 An Educational Processor-A Design Approach_298

    1/40

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    An Educational Processor: A Design Approach

    Raed Alqadi, & Luai Malhis

    Department of Computer Engineering, Faculty of Engineering, An-Najah

    National University, Nablus, Palestine.

    E-mail: [email protected]

    Received: (5/1/2005), Accepted: (7/3/2006)

    Abstract

    In this paper, we present an educational processor developed at An-Najah

    National University. This processor has been designed to be a powerful

    supplement for computer architecture and organization courses offered at theuniversity. The project is intended to develop an easily applicable methodology

    by which students get a valuable experience in designing and implementing

    complete processor with simple readily available off-the-shelf components. The

    proposed methodology is beneficial to computer engineering students enrolled

    at universities, specially in developing countries, where advanced laboratory

    equipments are rarely available. The design philosophy is to build a general-

    purpose processor using simple and wide spread integrated circuits. This

    methodology includes: defining an instruction set, datapath layout, ALU

    implementation, memory interface, controller design and implementation. For

    testing and evaluation purposes, a debugging tool is also built and implemented

    to serially connect the processor to a personal computer. In this paper, wepresent the methods and components used in the design and implementation

    phases and the tools developed to complete the design process. This

    methodology has shown that students enrolled in the computer architecture

    course get the full experience in processor design without the need for advanced

    laboratory equipments. The components used are cost efficient and methods

    proposed allow students to work at home, hence, this methodology has proven

    to be cost effective and yet very educational.

    Key Words: Educational Processor, RISC Architecture, Computer

    Architecture, Instruction Set, Controller Design, Monitor Program.

  • 7/31/2019 An Educational Processor-A Design Approach_298

    2/40

    2 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    .

    .

    .

    1. Introduction

    Computer architecture, computer organization, and microprocessors,

    are typical courses taught in computer/electrical engineering and

    computer science departments. In these courses, a detailed study of

    processor instruction set and design is investigated thoroughly. In acomputer architecture course, students are mainly theoretically

    introduced to the design and implementation of an educational processor

    like the DLX processor(9). Students can get a great experience in the

    processor design methodologies if they are asked to complete a practical

    processor design project. In this paper we propose a new processor that

    we call Educational Processor Unit (EPU), and a new design

    methodology that could be used for educational purposes.

    The processor we propose has been designed and developed at An-

    Najah National University as a supplement to a computer architecture

  • 7/31/2019 An Educational Processor-A Design Approach_298

    3/40

    Raed Alqadi, & Luai Malhis 3

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    course. The architecture of this processor has been chosen to assimilate

    the current trends in processor design. The EPUs instruction set is

    designed to follow the reduced instruction set architecture (RISC)(1,9)

    philosophy. Next, we give an overview of the EPU and the designmethodology adapted.

    The instruction set for this processor is complete and general. It

    provides a good example on the design of general-purpose processors.

    The design methodology is driven by the fact that advanced IC

    components such as ASCIs, CPLDs, FPGAs, and sometimes GALs, in

    addition to advanced laboratory equipments may not be available at the

    universities of developing countries. Hence, we based our design

    methodology on using basic integrated circuit (IC) components that are

    readily available at affordable prices. In fact, we will rely on common

    components such as ROMs, PALs, RAMs, and common MSI logiccircuits. The authors also believe that an efficient method for improving

    teaching computer-hardware related courses is to develop your own tools

    for designing, debugging and testing purposes. Therefore, to facilitate the

    EPU design, we have developed software tools such as: ALU generator,

    controller generator, assembler, and tools for an interface circuit that

    connects the EPU to a personal computer. The interface circuit facilitates

    data transfer between the EPU and the personal computer as well as

    simplifying the testing and debugging process. A monitor program has

    also been developed that can be loaded into the memory of the EPU. The

    monitor has been designed for the purposes of loading user programs into

    EPUs memory, getting a dump of the EPUs memory and executing aloaded program. Consequently, this methodology allows students to

    complete their projects at home with minimal cost and with no need for

    advanced laboratory equipment.

    With these objectives in mind, we propose the design of a new

    processor that will certainly enhance the students knowledge in this

    subject and yet is cost-effective. This methodology is very practical such

    that students are able to build complete processors at minimal cost. The

    experience they will go through will make them very knowledgeable in

    all phases of processor design. The tools they will build will provide a

  • 7/31/2019 An Educational Processor-A Design Approach_298

    4/40

    4 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    good example in interfacing hardware components with a personal

    computer. Proposing and developing new teaching methodologies to aid

    students in their courses is popular in worldwide-known universities and

    many papers with similar interests have been proposed. See forexample(1-7, 10).

    A detailed discussion of the processor proposed, methodology

    adapted, components used, and the communication software developed

    are presented in the following sections. Section 2 describes the

    instruction set chosen for this processor. Section 3 discusses the datapath

    including register file and other datapath components. The ALU design

    and its generator are also presented in this section. Section 4 contains a

    detailed discussion of the control unit and the timings of the control

    signals needed to control the datapath execution. A software tool for

    generating the control signals values is also presented. Section 5includes a discussion of interfacing the EPU serially to a PC

    accompanied with a monitor program used to debug the EPU and to

    download and execute user programs. Final thoughts and concluding

    remarks are presented in section 6.

    2. The Processor and Its Instruction Set

    The processor we propose is a 16-bit general purpose machine with

    features of recent processor design methodologies. Since recent trends in

    computer architecture tend to use RISC technology, we decided to use

    this trend because of its popularity in computer architecture courses as

    well as its simplicity in hardware design. Since the main objective of thispaper is to show a practical and applicable processor deign methodology,

    one could argue that 8-bit machine would be sufficient. However, we

    decided to use 16-bit architecture to show that the method is easily

    applicable to design and implement a 16-bit processor. The first step in

    processor design is to select the instruction format(s) and the instruction

    set. Modern machines use uniform, orthogonal instruction set and format;

    hence our instruction set follows this scheme. We have decided to use

    three formats which are 16-bits in size as shown in Figure 1. Any

  • 7/31/2019 An Educational Processor-A Design Approach_298

    5/40

    Raed Alqadi, & Luai Malhis 5

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    instruction in the instruction set can be placed into one of three

    categories: Immediate, Register or Jump.

    Immediate (register) type instruction requires performing an ALU

    operation on two operands one is a register (source and destination) and

    the other is an immediate value (a register). Jump (conditional /

    unconditional) is relative to current PC and an 11-bit signed value (-1024

    to + 1023) is added to the PC.

    Immediate Opcode(5 bits) RD Literal (8-bit)

    Register Opcode(5 bits) RD RS

    Jump Opcode(5 bits) Address (11 bits)

    Figure (1): Instruction Formats

    The 16 bits of an instruction is divided into the following parts:

    1. Opcode: Operation code, 5 bits, which means that up to 32instructions can be defined in the instruction set.

    2. RD: Destination register and source of first operand, 3 bits.3. RS: Source register of second operand, 3 bits.4. Literal: Signed immediate data, 8 bits.5. Address: Jump and conditional jump, 11 bits. The address is relative

    to the program counter.

    In our machine, eight 16-bit registers will be used. Register R7 is

    used as program counter and will also be visible to programmers so that

    it can be used to implement some instructions such as call, return, and

    jump to an address specified by a register. Also, one register will be used

    as a stack register but that is left to the programmer to choose.

  • 7/31/2019 An Educational Processor-A Design Approach_298

    6/40

    6 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    Throughout this paper, we will use R6 for stack operations to implement

    pseudo instructions such as push, pop, call and return. The complete

    instruction set is shown in Table 1. Even though there are only 30

    instructions, almost any program can be written using this instruction set.

    Table (1): Instruction Set

    No Opcode Instruction Operation

    0 00000 LU RD, L Load upper byte of RD with L, old upper

    byte is moved to lower byte.

    1 00001 LL RD, L Load lower byte of RD with L

    2 10000 LW RD, [RS] Store RD in memory at address specified

    by RS

    3 11000 SW [RS], RD Load RD from memory at addressspecified by RS

    4 11001 MOV RD, RS Move RS to RD

    5 00010 ADDL RD, L ADD L to RD

    6 00011 SUBL RD,L SUB L from RD

    7 00100 ANDL RD,L AND RD with L

    8 00101 ORL RD,L OR RD with L

    9 00110 XORL RD,L XOR RD with L

    10 00111 CMP RD, L Compare RD with L

    11 10010 ADD RD, RS ADD RS to RD

    12 10011 SUB RD,RS SUB RS from RD

    13 10100 AND RD,RS AND RD with RS

    14 10101 OR RD,RS OR RD with RS

    15 10110 XOR RD,RS XOR RD with RS

  • 7/31/2019 An Educational Processor-A Design Approach_298

    7/40

    Raed Alqadi, & Luai Malhis 7

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    Continue table (1)

    No Opcode Instruction Operation

    16 10111 CMP RD, RS Compare RD with RS

    17 11010 ROL RD Rotate left one bit, MSB goes to Carry

    18 11011 ROR RD Rotate right one bit

    19 11100 SHL RD Shift left one bit, MSB goes to Carry

    20 11101 SHR RD Shift right one bit

    21 11111 SAR RD Shift Arithmetic right one bit

    22 01000 JMP address Jump to Address

    23 01001 JZ address Jump if Zero to Address

    24 01010 JNZ address Jump if not Zero to Address

    25 01011 JC address Jump if Carry to Address

    26 01100 JNC address Jump if not Carry to Address

    27 01101 JS address Jump if sign is set to Address.

    28 01110 JNS address Jump if not sign to Address

    29 10001 SWAP RD,RS RD gets the bytes of RS but the low byteof RS is swapped with upper byte of RS

    before loading into RD

    To keep the design simple so that it can be easily implemented by

    students, some instructions will be defined as pseudo instructions. Pseudo

    instructions are implemented by using a combination of few instructions

    from the original instruction set. Table 2 shows some of these

    instructions. Note that original instruction set is general and sufficient to

    implement any program.

  • 7/31/2019 An Educational Processor-A Design Approach_298

    8/40

    8 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    Table (2): Pseudo Instructions

    No Pseudo Instruction Sequence of Instructions

    1 LI RD, 16-bit LU RD, L; ; Load Upper 8-bits of RDLL RD, L ; Load Lower 8 bits of RD

    2 CALL address SW [R6], R7 ; Push PC on stack

    ADD R6, 1 ; Increment stack pointer

    JMP address ; Go to subroutine

    3 RET SUB R6,1

    LW R7,[R6]

    4 PUSH RD SW [R6], RD

    ADD R6,1

    5 POP RD SUB R6,1

    LW RD, [R6]

    The next section discusses the datapath that implements these

    instruction and components used in the implementation.

    3. Component Organization and Layout

    Figure 2 shows the components that make up the datapath for our

    EPU. As mentioned above, the register file consists of 8 registers of

    which R7 is used to implement the program counter (PC). As shown in

    the figure the datapath consists of the following components: special-

    purpose registers (IR, MAR, FLAGS and PC (R7)), a set of general

    purpose registers (total of 7 R0-R6 in the register file), multiplexers, asign-extension, SWAP, PC select (OR) and memory units. All registers

    are 16 bits in size. These components are discussed in details in the

    following subsections.

    3.1Registers, Program Counter and MultiplexersIn this subsection, we describe the main components of the datapath

    shown in Figure 2 which includes: IR, Register File, Multiplexer, MAR,

    Memory, Flags, OR, Swap, and the Sign Extension components. The IR

    holds the instruction fetched from memory. The MAR holds the operand

  • 7/31/2019 An Educational Processor-A Design Approach_298

    9/40

    Raed Alqadi, & Luai Malhis 9

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    S

    R

    O

    P

    CO

    D

    E

    D

    R

    J

    M

    P

    A

    D

    DR

    E

    SS

    (PCS)PC

    select

    Register

    File

    AL

    U

    Sign

    Ext

    (SXT) Sign Extend

    Data

    RF_WR

    IR_WR

    Memory

    Address

    MAR

    Data bus

    MAR_WR

    ALU_OP

    MEM_RD

    MEM_WR

    ALU_MUX

    ALU_EN

    OR

    IR

    A

    B

    RS Data

    RD Data

    ALU_EN and

    MEM_RD can not

    be active at the

    same time.

    Shared bus (16-bit)

    Z C SFLAG_WR

    SWAP

    ALU

    MU

    X

    S

    WA

    P

    RS/RD Select

    address to be loaded (stored) from (into) memory. Each of these two

    registers is implemented by using two 74LS374 ICs*. The register file is

    a two-port register file with RS and RD fields (3 bits each) connected to

    the address lines to select the corresponding data register at the RS andRD output data ports. The RD field also selects the destination register.

    Figure (2): Datapath

    *Even though we use LS in our description S, LS, ALS, F, and HCT technologiescan be used.

  • 7/31/2019 An Educational Processor-A Design Approach_298

    10/40

    10 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    The register file can be implemented by using specialized register

    file ICs or by using common registers and decoders. We used the second

    option because our objective is educational and we believe that it is

    important to let the students design and build every major component of

    the EPU. The hardware needed to implement these components is

    significant and is shown in Figure 3. This is because the design

    methodology dictates using readily available and easy to program ICs. In

    the design of the register file, it has been found that there is a problem in

    the number of Tri-State buffers needed which is 32 ICs. Therefore, to

    reduce the number of ICs (mainly the tri-state buffers) we have alteredthe register file implementation as shown in Figure 4. Such

    implementation uses much less hardware, but requires special attention to

    timing. The main difference between the register file shown in Figure 3

    and the one shown in Figure 4 is that the second requires two cycles to

    output the RS and RD ports whereas the first one requires one cycle.

    Note that in this design we have used the internal Tri-state buffers of the

    74LS374 registers.

  • 7/31/2019 An Educational Processor-A Design Approach_298

    11/40

    Raed Alqadi, & Luai Malhis 11

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    Figure (3): Possible Implementation of the Register File

  • 7/31/2019 An Educational Processor-A Design Approach_298

    12/40

    12 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    Figure (4): Register File Used in Our Implementation

  • 7/31/2019 An Educational Processor-A Design Approach_298

    13/40

    Raed Alqadi, & Luai Malhis 13

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    74LS244

    124681

    9

    1

    1

    1

    3

    1

    5

    1

    7

    1

    8

    1

    6

    1

    4

    1

    2

    9753

    1

    G

    1

    A

    1

    1

    A

    2

    1

    A

    3

    1

    A

    4

    2

    G

    2

    A

    1

    2

    A

    2

    2

    A

    3

    2

    A

    4

    1

    Y

    1

    1

    Y

    2

    1

    Y

    3

    1

    Y

    4

    2

    Y

    1

    2

    Y

    2

    2

    Y

    3

    2

    Y

    4

    74LS244

    124681

    9

    1

    1

    1

    3

    1

    5

    1

    7

    1

    8

    1

    6

    1

    4

    1

    2

    9753

    1

    G

    1

    A

    1

    1

    A

    2

    1

    A

    3

    1

    A

    4

    2

    G

    2

    A

    1

    2

    A

    2

    2

    A

    3

    2

    A

    4

    1

    Y

    1

    1

    Y

    2

    1

    Y

    3

    1

    Y

    4

    2

    Y

    1

    2

    Y

    2

    2

    Y

    3

    2

    Y

    4

    1

    212

    3

    12

    3

    12

    3

    D7D8D9

    D10

    Output bits D8--D15 SXT Control0 Sign Extend 8 bits

    1 Sign Extend 11 bits

    PCS

    RD Field (3-bits)

    (b) SXTCircuit(a) OR Circuit

    PCS=1,Out=R7=PC codePCS=0,Out=RD code

    The PC Select (OR) Unit: The OR component consists of three OR

    gates to make the output 7 to select the PC (R7) or pass the RD field, as it

    will be made clear later. This will be required to increment the PC, and

    also in a jump instruction. When the PC select signal is asserted(1)

    theoutput will be 7(PC), when the PCS signal is 0 the output is the RD field.

    Figure 5(a) illustrates the OR component.

    Figure (5): OR and SXT Circuit

  • 7/31/2019 An Educational Processor-A Design Approach_298

    14/40

    14 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    I1 I0 I1 I0

    D0-D7 D0-D7D8-D15 D8-D15

    8 8 88

    D0-D7D8-D15

    88

    SWAPMUXMUX

    SelectSelect

    Sign Extend Unit: The circuit of this unit is shown in Figure 5(b). It

    has a sign extend component that will be used to perform either sign

    extend 8-bit literal value to 16-bits or sign extend 11-bit literal value to

    16 bits. We need to sign extending 8-bit literal value to 16-bit ininstructions where one of the operands is literal. We need to sign

    extending 11-bit literal value to 16 bits for jump instructions.

    The ALU MUX: This multiplexer is needed to select either the

    output of the SXT unit or the source register data (RS) for the second

    ALU operand. A single bit control signal (ALU_MUX) is needed for this

    purpose. This multiplexer is also implemented by using tri-state buffers

    (74LS244) ICs.

    The Swap Circuit: This circuit either passes the output of the ALU

    MUX as it is or swaps the lower byte with the upper byte. The swapping

    is used to implement the LU and the SWAP instructions. Only in thesetwo instructions the SWAP signal is asserted, while for all other

    instructions the swap circuit simply passes its inputs. The swap circuit is

    implemented by using two multiplexers. Each multiplexer has two sets of

    inputs where each set takes 8 bits (either the lower 8 bits or the upper 8

    bits) see Figure 6 for details.

    Figure (6): The Swap Circuit

  • 7/31/2019 An Educational Processor-A Design Approach_298

    15/40

    Raed Alqadi, & Luai Malhis 15

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    A0 -A3B0 -B3

    Cout

    Zout Stage 1

    Cin

    A4 A7B4 B7

    Cout

    Zout Stage 2

    Cin

    Zin

    A8 A11B8 B11

    Cout

    Zout Stage 3

    Cin

    Zin

    A12 A15B12 B15

    SR

    I

    Stage 4

    Cin

    Zin

    A0

    A8

    SRI SRI SRI

    A12 A4

    C Z S

    Cout

    Zout

    A15

    A15

    F12 F15 F8 F11 F4 F7 F0 F3Flags (Flip Flops)

    Used for

    Rotate Left

    Used

    forRotate

    Right

    Note: SRI is used for shift and rotate right

    3.2. The Arithmetic and Logic Unit

    In this section, we present a technique for building custom ALUs by

    using EPROMs/EEPROMs, even though it is possible to build the ALU

    by using components such as 74181 or similar ICs. We believe it is

    better to let the students design ALUs from components such as GALs,

    PALs, or EPROMs. Here, we will describe building 16-bit ALU by using

    our software tool that generates EPROM programming files. By using

    this method, we can readily build custom ALUs from popular EPROMs

    such as 2764, 27128, 27256, and 27512.

    Figure (7): Block Diagram of Four Stages ALU

  • 7/31/2019 An Educational Processor-A Design Approach_298

    16/40

    16 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    Figure 7, shows a 16-bit ALU built using four stages where each

    stage takes 4 bits of a16-bit operand. Each stage corresponds to one

    EPROM IC. Note that the address bus is used for inputs and data bus is

    used for the outputs. The set of ALU operations required must satisfy therequirement of the instruction set. The following set of operation is

    generally sufficient for a wide range of instruction sets, recall that A is

    connected to RD (destination register) and B is connected to the swap

    unit, which passes through or swaps either an RS or sign-extended literal.

    1. F=A AND B.2. F=A OR B.3. F=A XOR B.4. F=A+B ; ADD5. F=A-B ; SUB6. F=A+1. ; INC7. F=A. ; PASSA8. F=B ; PASSB9. F=A ROL 110.F=A ROR 111.F = A SHL 112.F = A SAR 113.F = A SHR 114.F = [A15:A8]:[B7:B0] ; needed for LL instruction15.F = [B15 :B7]:[A7:A0]; needed for LU instruction

    The following algorithm is used to generate the program files for the

    four 27512 EPROMs. This algorithm is designed for a 4-stage ALU but

    can easily be modified for ALUs with more stages.

  • 7/31/2019 An Educational Processor-A Design Approach_298

    17/40

    Raed Alqadi, & Luai Malhis 17

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    Algoritm ALUGenStage (int Stage)

    { Assign the bits for A, B, Cin, Zin and SRI to the Address input.

    Address = inputs = A,B, Zin, SRI

    Open Binaryfile

    FOR address =0 to Address = 2(No. of Inputs) {

    // extract the operands from the address bits, and shift them

    appropriately

    A = [AD3:AD0] ; bits 03 of address

    B = [AD7:AD4]>>4 ; bits 47 of address

    Cin=[AD8]>>8;

    SRI = [AD9] >>4; // Put bit in fifth position

    OP = [AD13:AD10]>>10;

    Zin =AD[14] >>14;

    Let F be the output of 5 bits where the fifth bit is the Carry (Cout)

    switch(OP){

    case ADD: if (stage == 1) F = A + B; else F = A + B +Cin;

    case SUB: if (stage == 1) F = A -B; else F = A - B - Cin;

    case INC: if (stage == 1)F = A +1 ; else F = A + Cin;

    case AND: F = A & B;

    case OR: F = A | B;

    case XOR: F = A ^ B ;

    case PASSA: F =A

    case PASSB: F = B;

    case ROL: F = A

  • 7/31/2019 An Educational Processor-A Design Approach_298

    18/40

  • 7/31/2019 An Educational Processor-A Design Approach_298

    19/40

    Raed Alqadi, & Luai Malhis 19

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    F = F|Zout;

    //Result in F=F4:F0 is the result of operation, F5 is Cout and

    F6is the Zout

    //Store the output F byte to file.

    }

    4. The Control Unit

    Because of the simplicity of the hardwired design and for educational

    purposes, we choose to implement the control by unit using the

    hardwired control methodology. This unit is abstracted as a finite state

    machine. In this abstraction, the control unit transits through a finite set

    of states and is responsible for asserting and un-asserting all controlsignals necessary for the datapath to function properly. The control unit

    simply remembers the current state of the system and then based on both,

    the current state and the set of input parameters (instruction opcode), the

    next state is determined. In addition, necessary control signals for the

    current state are also asserted. This process is repeated for all instructions

    in the instructions set.

    4.1 EPU Control

    The control signal needed to control the various activities of the

    datapath components are shown in Figure 2. These control signals are

    specific to the type of components chosen for our datapath design.Control signal names and values may change if the datapath components

    change. Nevertheless, we based our design on using the most primitive

    off-the-shelf components that are available at affordable prices. The

    control signals of our datapath components are described in more details

    in Table 3. Note that IR_WR and RD/RS-Select use the same control

    signal. This fact will be illustrated in the timing analysis described later.

    Also notice that the ALU_EN and the RAM_RD are complements of

    each other. The ALU_EN is connected to the Output Enable (OE) of the

    EPROMs that implement the ALU.

  • 7/31/2019 An Educational Processor-A Design Approach_298

    20/40

    20 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    Table (3): Control Signal Names and Functions

    Control signal Abbreviation Description

    Register File Write RF_WR Write to Destination Register

    Instruction Register

    Write

    IR_WR

    Also RS/RD

    Select

    Write Instruction to Instruction

    Register

    (Selects RD/RS in register file)

    MAR Register Write MAR_WR Write to MAR Register

    MEM Read MEM_RD Enables reading from Memory

    MEM Write MEM_WR Enables writing to Memory

    ALU enable ALU_EN Enables the ALU Tri-state buffer

    output

    ALU MUX Selection ALU_MUX Operand select Control Line

    ALU operation ALU_OP (4-bits)The Specified ALU Operation

    PC Select PCS Selects PC(R7) or RD implemented

    by ORing PCS with RD (RD =

    111=PC)

    Sign Extend STX Either Sign Extends 8-bit literal or 11-

    bit address to 16-bit value

    Flags Write FLAG_WR Write the carry, Sign and Zero D-Flip

    Flops

    The first step in designing control is to categorize the instructions in

    the instruction set into groups such that all related instructions are placed

    in one group. All instructions in a given group transit through the same

    set of states in the state diagram when executed. Also, the same control

    signals are asserted and un-asserted for all instructions in the group. This

    makes the design much easier to manage. In addition, control signal

    values are determined by state bases rather than by instruction bases.

    Each instruction in the instruction set is placed into one of seven

    categories as shown in Table 4. The categorizing is based on the type of

    operation (ALU, Memory Load/Store, or Jump) and on whether it

    affects the FLAGS, or if branch taken or not.

  • 7/31/2019 An Educational Processor-A Design Approach_298

    21/40

    Raed Alqadi, & Luai Malhis 21

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    Table (4): Categorizing the Instruction Set

    Group A

    OP RD, RSOP RD,L

    (flags

    affected)

    Group B

    OP RD,RS

    OP RD,L

    (flags not

    affected)

    Group C

    OP RD,RS

    OP RD,L

    (Only

    Flags

    affected)

    Group D

    Taken jumps(Conditional

    And un-

    conditional)

    Groups E

    And FMemory

    Load and

    Store

    Group G

    Untakenjumps

    (Conditional)

    ADDL RD, L Group B1 CMP

    RD,L

    JMP address Group E JZ address

    SUBL RD,L LL RD, L CMP

    RD,RS

    JZ address LW

    RD,[RS]

    JNZ address

    ANDL RD,L MOVRD,RS

    JNZ address JC address

    ORL RD,L JC address JNC address

    XORL RD,L Group B2 JNC address JS address

    ADD RD,

    RS

    LU RD, L JS address Group F JNS address

    SUB

    RD,RS

    SWAP

    RD,RS

    JNS address SW [RS],

    RD

    AND

    RD,RS

    OR RD,RS

    XOR

    RD,RS

    ROL RD

    ROR RD

    SHL RD

    SAR RD

    SLR RD

  • 7/31/2019 An Educational Processor-A Design Approach_298

    22/40

    22 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    Figure (7): The State Diagram

    The state diagram that comprises the states of the finite state machine

    that executes all instructions as categorized in Table 4 is shown in Figure

    7. The operations performed in each state and which control signals are

    asserted is shown in Table 5. In order to clarify the operations in each

    state, S3 has been shown as 4 sub-states, but it is actually one state.

    Although there are many possible implementations, we have chosen the

    above states to facilitate the control unit implementation. In fact, the

    technique will still work for different state diagrams.

    S31S3

    2S33S3

    4

    S4

    S2

    S5C

    S0

    S6AB

    D

    EF

    Fetch States

    Decode State

    S1

    S3

    G

    E,F

  • 7/31/2019 An Educational Processor-A Design Approach_298

    23/40

    Raed Alqadi, & Luai Malhis 23

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    Table (5): The Operations Performed in Each State and Control Signals

    Asserted

    State Group OperationControl Signals (other signals arein their initial value, some signals

    are active Low)

    S0 ALL MARPC ALU_OP = PASS RD, ALU_EN,

    MAR_WR, PCS(PC SELECT)

    S1 ALL IRMEM[PC] MEM_RD, IR_WR, ALU_EN =

    DISABLED

    S2 ALL PCPC+1 ALU_OP = INC A, ALU_EN,

    RF_WR, PCS

    (PC SELECT)

    S31 A RD RD op (RS or L) ALU_OP = F(opcode) ALU_EN,

    RF_WR, FLAG_WR

    S32

    B RD RD op (RS or L) ALU_OP = F(opcode) ALU_EN,

    RF_WR

    S33

    C RD op (RC or L) ALU_OP = F(opcode)= subtract,

    ALU_EN, FLAG_WR

    S34

    D PCPC+Address(ST

    X)

    ALU_OP =F(opcode)=ADD,

    ALU_EN, RF_WR, PCS (PC

    SELECT)

    S4 E,F MARSR ALU_OP =PASS RS, ALU_EN,

    MAR_WR

    S5 E RDMEM[RS] ALU_EN=DISABLED,

    MEM_RD, RF_WR

    S6 F MEM[RS] RD ALU_OP= PASS RD,ALU_EN,

    MEM_WR

    To simplify the control unit, we need to analyze the control signals

    carefully. There are three signals that are functions of the opcode only

    and not of the current state. These signals are: ALU_MUX, SWAP, and

    SXT. The ALU_MUX signal is selected as the value of the most

  • 7/31/2019 An Educational Processor-A Design Approach_298

    24/40

    24 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    Opcode

    5bits

    Z

    C

    S

    SXT

    EPROM

    (2 ICs)

    NEXT_STATE

    Next State (3 bits)

    ALU_MUX

    RF_WR

    FLAG WR

    IR_WR Also RD/RS-Select

    PCS

    MAR WR

    ALU_EN

    MEM WR

    ALU OP (4-bits)

    State Register

    (3 Flip flops)

    Total EPROM Outputs = 16 outputs

    SWAP

    Bit 4(MSB) of

    Opcode

    MEM_RD =ALU_EN

    significant bit in the opcode (see Table 1 for opcode assignment). The

    SWAP and SXT signals are generated along with the other control

    signals as state-dependent signals. Generating these two signals along

    with the state dependent signals add no cost to the required hardware.Since, (see figure 8), we have 14 bits (7 state-dependent control signals,

    4-bit ALU operation and 3-bit next state value) we need two EPROM

    ICs. Thus, the two signals SWAP and SXT can be included at no cost.

    Tables 6 describes how SXT and SWAP are determined along with an

    instruction group. Based on current state and the instruction group other

    control signals are generated as shown in Table 7. Note that the

    ALU_MUX signals is taken directly form the opcode (see Figure 8).

    Figure (8): The Control Unit

    Clock

  • 7/31/2019 An Educational Processor-A Design Approach_298

    25/40

    Raed Alqadi, & Luai Malhis 25

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    Table (6): State Independent Control Signals and Group code.

    No.

    Inputs Outputs

    GroupOpcode S Z C SWAP SXT

    1 Group A - - - 0 0 A

    2 Group B1 - - - 0 0 B

    3 Group B2 1 0 B

    4 Group C - - - 0 0 C

    5 Group E - - - 0 0 E

    6 Group F - - - 0 0 F

    7 JMP - - - 0 1 D

    8 JS 0 - - 0 1 G

    9 JS 1 - - 0 1 D

    10 JNS 1 - - 0 1 G

    11 JNS 0 - - 0 1 D

    12 JZ - 0 - 0 1 G

    13 JZ - 1 - 0 1 D

    14 JNZ - 1 - 0 1 G

    15 JNZ - 0 - 0 1 D

    16 JC - - 0 0 1 G

    17 JC - - 1 0 1 D

    18 JNC - 1 0 1 G

    19 JNC - 0 0 1 D

    F (opcode) = 0 if operation requires select Literal, F(opcode) = 1 if operation

    requires RS

  • 7/31/2019 An Educational Processor-A Design Approach_298

    26/40

    26 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    Table (7): Generating Next State and State-dependent Signals

    Inputs Outputs

    Present State OPCODE

    (Group)

    Next State Other Control Signals

    S0 - S1 S0 Signals as in Table 5, SWAP

    & SXT as in Table 6

    S1 - S2 S1 Signals as in Table 5, SWAP

    & SXT as in Table 6

    S2 A,B,C,D S3 S2 Signals as in Table 5, SWAP

    & SXT as in Table 6

    S2 E,F S4 S2 Signals as in Table 5, SWAP

    & SXT as in Table 6S2 G S0 S2 Signals as in Table 5, SWAP

    & SXT as in Table 6

    S3 A S0 S31

    Signals as in Table 5, SWAP

    & SXT as in Table 6

    S3 B S0 S32

    Signals as in Table 5, SWAP

    & SXT as in Table 6

    S3 C S0 S33 Signals as in Table 5, SWAP

    & SXT as in Table 6

    S3 D S0 S34

    Signals as in Table 5, SWAP

    & SXT as in Table 6

    S4 E S5 S4 Signals as in Table 5, SWAP

    & SXT as in Table 6

    S4 F S6 S4 Signals as in Table 5, SWAP

    & SXT as in Table 6

    S5 - S0 S5 Signals as in Table 5, SWAP

    & SXT as in Table 6

    S6 - S0 S6 Signals as in Table 5, SWAP

    & SXT as in Table 6

  • 7/31/2019 An Educational Processor-A Design Approach_298

    27/40

    Raed Alqadi, & Luai Malhis 27

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    4.2 Generating the Control Hardware of EPU

    Based on the above explanation, we will present an algorithm for

    generating a controller implemented from EPROMs /EEPROMs.

    Although the algorithm presented is specific to the state diagram shown

    in Figure 7, it can be easily modified for any state diagram. In our

    implementation we need two ROM ICs (ROM1 andROM2).

    Algorithm Generate Controller

    Note:is concatenate operation

    Open ROM1_file and ROM2_file

    Address = inputs = [OPCODE, Z,C, S, and PRESENT _ STATE =

    STATE _ REGISTER]

    Let O1 =[SWAP,SXT]; // 2 bits

    Let O2 = [ALU_OP,IR_WR,RF_WR]; // 6 -bits

    Let O3=[ALU_EN,PCS, MAR_WR, FLAG_WR, MEM_WR]; // 5bits and

    Let GenerateTable5 (Si,GROUP) be a function that sets O2 and O3 as

    shown in the corresponding table row in Table 5,where Si is a state //

    For address = 0 to 2(No of Inputs)

    Extract OPCODE, PRESENT_STATE from addressSet GROUP = Table6_Group (opcode,Z,C,S);// Group output as a

    function of opcode in Table6

    Set O1=[SWAP, SXT] = Table6_SWAP_SXT(opcode); // SWAP and

    SXT can as function of opcode,Z,C,S in Table 6

    [O2:O3] = GenerateTable5(PRESENT_STATE, GROUP)

    IF (PRESENT_STATE) NOT in (S0, S1, S6) // invalid state

    ROM1 = 0, ROM2 =0 // That is, send to initial state 0

  • 7/31/2019 An Educational Processor-A Design Approach_298

    28/40

    28 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    ELSE {

    IF (PRESENT_STATE == S0 ) NEXT_STATE = S1

    ELSE IF (PRESENT_STATE == S1) NEXT_STATE= S2

    ELSE IF( PRESENT_STATE in (S3,S5,S6) NEXT_STATE = S0

    ELSE IF (PRESENT _ STATE= S4 && GROUP = E) NEXT _

    STATE = S5

    ELSE IF (PRESENT_STATE= S4 && GROUP= F) NEXT _

    STATE = S6

    ELSE IF (PRESENT_STATE== S2){

    SWITCH (GROUP){

    CASES A, B, C, D: NEXT_STATE = S3

    CASES E, F: NEXT_STATE = S4

    CASE G: NEXT_STATE = S0

    }// End Switch

    }// End ELSE_IF

    ROM1 = O1:O2 ;// byte to be written to ROM1 file

    ROM2 = O3:NEXT_STATE ,// byte to be written to ROM2 file

    }// end ELSE

    Write ROM1 to ROM1_File and ROM2 to ROM2_File

    End for

  • 7/31/2019 An Educational Processor-A Design Approach_298

    29/40

    Raed Alqadi, & Luai Malhis 29

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    T0 T1 T2T3

    MEM_RD

    IR_WR

    Active low

    ALU_ENActive low

    RF_WR

    IR_WR_CLK

    MAR_W

    MAR_WR_CLK

    RF_WR_CLK

    Clock

    ALU = Pass PC ALU = Dont Care ALU = Inc PC

    PC select = 1 PC select = 0

    ALU = OP

    PC written here RD written here

    OP RD,RS or OP RD, L

    STORE RS Clock

    (In Register File)

    Figure (9): Timing Diagram for Group A

  • 7/31/2019 An Educational Processor-A Design Approach_298

    30/40

    30 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    4.3 EPU Timing Requirements

    Before implementing the control, the timing diagram for each group

    of instructions shown in Table 4 must be drawn. This process is

    necessary to ensure that there are no glitches and the data is written at the

    correct triggering edge to guarantee the setup and hold times for the

    registers and memory. Also, attention should be given to edge- triggered

    and level-triggered components, for example memory read/write signals

    are level-triggered active low signals. Also, since the ALU bus and the

    RAM bus are shared the ALU output and the RAM output are shared,

    therefore they cannot be enabled together.

    Figure (10): Timing Diagram for Load Operation

    T0 T1 T2 T3

    RAM RD

    IR WR

    Active low

    ALU EN Active low

    RF WR

    IR WR CL

    RF WR CL

    Cloc

    ALU = Pass ALU = Dont ALU = Inc

    PCselect = 1 PC select

    ALU = Pass

    PC written RD written

    T4

    RAM RD

    LW RD, [RS]

    STORE RS Clock

    (In Register File)

    MAR = RS

  • 7/31/2019 An Educational Processor-A Design Approach_298

    31/40

    Raed Alqadi, & Luai Malhis 31

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    To guarantee timing requirements for correct EPU operations, two

    approaches are possible. In the first approach, all clock signals are

    generated by the controller, this is an easy approach, but it requires extra

    states. In the second approach, the state register is triggered by negativeedge and the register clock signals (IR_WR, MAR_WR and FLAG_WR)

    are generated by ANDing the corresponding write signal with the clock.

    Note that this will generate a glitch free write signal, however the

    opposite will not. This timing step may result in modifying the state

    design step and an iterative process is necessary to refine the design. To

    illustrate the timing diagram, we will present the timing diagram for

    Group A, Figure 9, (ALU operations) and Group E, Figure 10, (Load

    Operation). Other group timings have been omitted for space

    considerations

    5. Monitor Program and Interfacing to PC

    In this section, we will describe a monitor program (simple operating

    system) used to download and execute user programs. The monitor

    program is stored in an EPROM while user programs are downloaded

    into RAM. The monitor program allows the EPU to be interfaced to a PC

    through serial port, and hence, can communicate with an interface

    program written by the authors. The user can download programs to the

    EPU RAM and execute the downloaded programs. The programs must

    first be written in assembly using the EPU instruction set and then

    assembled by the assembler developed by the authors. The assembler

    generates specially formatted binary files which the monitor downloadsto RAM. The binary file is composed of frames (records) of the

    following format:

    SOF FT AH-AL COUNT HB-LB HB-LB HB-LB CSH-CSL

    Where:

    SOF: Start of frame, one byte, we use :.

    FT: Frame Type; 00:Regular Frame, FF: Last Frame, one byte

  • 7/31/2019 An Educational Processor-A Design Approach_298

    32/40

    32 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    AH-AL: High and Low bytes of address at which the instructions in

    the record start.

    COUNT: Number of Instruction words in the frame usually 16. This is

    a one byte.

    HB-LB : High byte and Low byte of an instruction.

    CSH-CSL: High and Low bytes of Checksum.

    Figure (11): Interfacing the EPU to a Personal Computer

    Address

    decoding

    logic

    Address

    Bus

    CS

    UAR

    T

    8251

    e.g. 8-input NAND to set at address FF00H

    A0 : bit 0 of

    Address bus

    MEM_WR

    MEM_RD

    FROM the EPU

    D0D7

    Low Byte of

    Data Bus

    RS 232

    Driver

    (e.g. MAX232)

    To Serial Port of the

    Personal Computer

    (RX, TX and GND lines

    are required only)

  • 7/31/2019 An Educational Processor-A Design Approach_298

    33/40

    Raed Alqadi, & Luai Malhis 33

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    Before describing the monitor program, we need to describe the

    required hardware in order to interface the EPU with the personal

    computer. Since we are using the serial port, then a UART will be

    required which can be implemented either by hardware or by software.Here, we will describe the interface that uses the hardware UART

    8251(8). Implementing the UART using software is fairly simple and we

    recommend using it if a hardwired UART is not available. Figure 11

    shows the interface circuit. The UART code will reside at addresses

    FF00H and FF01H of the EPU memory as shown in Figure 12.

    Figure (12): EPU Memory Divisions

    Monitor

    EPROM

    Stack

    UART

    User

    Program

    0000H

    FFFFH

    FF00H

  • 7/31/2019 An Educational Processor-A Design Approach_298

    34/40

    34 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    Monitor Algorithm:

    The monitor starts at address 0000H as shown in Figure 12 and will

    be executed when the system is reset. We will describe the monitor

    algorithm by using pseudo code; however some subroutines are presented

    in detailed assembly while some details are eliminated for simplicity of

    presentation. The monitor executes the following commands P to

    download program to the EPU, E to execute a loaded program and M

    to display a memory block. The pseudo code of the monitor algorithm is

    given below. First the main program is described followed by the

    subroutines.

    Main Monitor Program

    Initialize stack pointer: // Li R6, Stack-Start

    Main-Loop //Pseudo Code

    R5 = Call Read-Serial; // Read command sent from PC

    ; // This is a blocking read;

    Switch (R5) {

    Case P: Send Ack-Byte; // Sends a R for ready

    Call Receive-Download; // Receive user program.

    Case M: Call Display-Memory; / Display a block of memoryCase E: Jmp Execute-Program; // Execute User Program

    Default: // Otherwise ignore

    End Switch

    Jmp Main-Loop; // loop forever

    End Main

  • 7/31/2019 An Educational Processor-A Design Approach_298

    35/40

    Raed Alqadi, & Luai Malhis 35

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    Receive-Download: //Pseudo Code

    // Receives the user program file composed of frames with format

    described above.

    Next-Frame:

    R5= Read-Serial // Get Start of Frame

    If (R5 != SOF) Send-Error and return; // Not start of frame

    character so

    R5= Read-Serial // Get Frame Type

    If (R5 == FF) return; // if Last Frame (FF) then stop.

    R3 = Read-Word // Get Start Address of instructions in R3

    R2 = Read-Serial; // Get Number of Instructions in R2R1=0; // Initialize Computed Check Sum =0h

    While (R2 != 0){ // Read all Instruction words, R2 is Counter

    R5 = Read-Word // Read Instruction

    MOV [R3], R5 // Store Instruction at address in R3

    R3 = R3+1; // Increment Instruction address pointer

    R2= R2 -1 // Decrement Instruction Count by 1

    R1 = R1 + R5; // Add read word to computed checksum,

    }// End While

    R5 =Read-Word; // Get Checksum

    If (R5 != R1) Send Error; // if received and computed checksums are

    // not equal send error.

    JmpNext-Frame; // get Next frame

    End Receive-Download

  • 7/31/2019 An Educational Processor-A Design Approach_298

    36/40

    36 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    Display-Memory //Pseudo Code

    This subroutine displays a block of memory specified by the starting

    and end address.

    Expects Starting address and end address

    R2 = Read-Word // Get block Start address and store in R2

    R3 = Read-Word // Get End address in R3

    While (R2 != R3){ // While address not equal to End address

    LW R5,[R2] // Read word from memory in R5

    Send-Word // Send the word in R5 to serial port

    R2 = R2 +1 // Increment address

    }

    End Display-Memory

    Read-Serial: //Assembly code

    // Receives a byte from UART

    LI R4, UART // UART status register address in FF00

    WAIT: MOV R5, [R4]; // Read Status Register

    ANDL R5, 01 // Is receiver empty?

    JZ WAIT // If empty wait.

    ADDL R4, 1; // Address of data register

    MOV R5, [R4]; // Read data register, received byte

    Li R4, 00FF // Mask off upper byte.

    ANDL R5, R4 // Received byte in lower byte of R5

    RETURN; // Result in lower byte of R5, high byte is zero

    End Read-Serial

  • 7/31/2019 An Educational Processor-A Design Approach_298

    37/40

    Raed Alqadi, & Luai Malhis 37

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    Read-Word: //Assembly code

    Reads a word from the UART and stores result in R5

    CALL Read-Serial ; //High byte is read and stored in Low byte of R5

    MOV R0, R5 //High byte is now in low byte R0

    SWAP R0, R0; // Now in High byte of R0

    CALL Read-Serial; //Low byte is read and is stored in Low

    byte of R5

    OR R5,R0 // Word is in R5

    RETURN // Result is in R5

    End Read-Word

    Send-Serial ///Assembly code

    // Sends the lower byte in R5 serially by using the UART

    Li R4, UART // Address of UART Status register

    N_RDY: MOV R0,[R4]; // Read Status register

    ANDL R0, 02 ; // Is transmitter empty

    JNZ N_RDY //If not empty wait

    ADDL R4, 1; // Address of data register

    MOV [R4], R5 // Send Byte

    RETURN

    End Send-Serial

    Send-Word: ///Assembly code

    // Sends the word in R5 serially by using the UART

    SWAP R5,R5 // To send the high byte first

    CALL Send-Serial //Send High Byte

    SWAP R5,R5 //To Send Low Byte

    CALL Send-Serial // Send Low byte

    RETURN

    End Send-Word

  • 7/31/2019 An Educational Processor-A Design Approach_298

    38/40

    38 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    Execute-Program: //Pseudo Code

    Executes the user program, it expects the starting address of the program

    R5 = Read-Word; //Read the starting address of the program in R5MOV PC ,R5

    RETURN // Jump to starting address of user program

    End Execute-Program

    6. Conclusion

    In this paper we described an educational processor along with a

    methodology to develop it. The main objective behind this processor is to

    demonstrate that general- purpose processor can be implemented using

    simple off-the-shelf components. In this project, we discussed all aspectsof processor design staring with instruction set definition which is based

    on RISC philosophy. Then, we discussed the design and implementation

    phase which is driven by the type of components available. Third, we

    showed how control signals are specified and the importance of their

    timings. Finally, we discussed a tool which serially interfaces the EPU

    with a PC. This tool includes a monitor program that resides in the EPU

    memory and is used to transfer data/programs between the EPU and the

    PC and to execute loaded programs.

    The methodology presented can be easily adopted by computer

    engineering (science) departments at universities in developing countries

    to supplement a course in computer architecture or in processor design.

    This methodology is used by students enrolled in the computer

    architecture course at An-Najah N. University to build a complete and

    general-purpose processor. The students enjoyed and benefit greatly from

    the experience they acquired from this project at a minimal cost, and with

    no special hardware resources. They are able to define, design,

    implement, and debug a complete processor. Finally, we recommend

    adopting this methodology by universities in developing countries.

  • 7/31/2019 An Educational Processor-A Design Approach_298

    39/40

    Raed Alqadi, & Luai Malhis 39

    An - Najah Univ. J. Res. (N.Sc.) Vol. 20, 2006

    References(1) M. Buehler, U. G. Baitinger ,"HDL-Based Development of a 32-Bit

    Pipelined RISC Processor for Educational Purposes," 9th

    Mediterranean Electrotechnical Conference (Melecon'98), Tel-Aviv

    Israel, may (1998), 138-142.

    (2)N. L. V. Calazans, F. G. Moraes, C. A. Marcon, "Teaching ComputerOrganization and Architecture With Hands-ob Experience," 32

    nd

    ASEE/IEEE Frontiers in Education Conference, Boston MA, Nov.

    6-9,(2002), T2F15-T2F20.

    (3) H. Diab, I. Demaskieh, "A Computer Aided Teaching Package ofMicroprocessor Systems Education," IEEE Transaction on

    Education, May (1991), 34(2), 179-183,.

    (4) J. Djordjevic, A. Milenkovic, N. Grbanovic, and M. Bojovic, "AnEducational Environment for Teaching a Course in ComputerArchitecture and Organization",IEEE TCCA Newsletter, July

    (1999), 5-7.

    (5) J. Gray,"Hands-on Computer ArchitectureTeaching Processor andIntegrated Systems Design with FPGAs", Workshop on Computer

    Architecture Education, June (2000), Vancouver BC.

    (6) R. D. Hersch, "Integrated Theory and Practice in MicroprocessorSystem Design Course", Proc. Euromicro, North Holland,

    Amsterdam (1984), 227-232.

    (7) Intel,"MCS 51 Microcontroller Family User Manual,"www.intel.com/design/mcs51/manuals/272383.htm

    (8) V. Milutonovic ,"Surviving the Design of a 200 MHz RICSMicroprocessor: Lessons Learned," IEEE Computer Society Press,

    (1997).

    (9) Patterson & Hennessy, Computer Architecture: A QuantitativeApproach, Morgan Kaufmann, (1990).

  • 7/31/2019 An Educational Processor-A Design Approach_298

    40/40

    40 An Educational Processor: A Design Approach

    An - Najah Univ. J. Res. (N. Sc.) Vol. 20, 2006

    (10)B, J. Rodriguez, "A minimal TTL processor for architectureexploration", Proceedings of the 1994 ACM Symposium on Applied

    Computing, Phoenix Arizona, (1994), 338-340.