Ece Viii Embedded System Design [06ec82] Notes


  • 8/22/2019 Ece Viii Embedded System Design [06ec82] Notes


    Embedded System Design 06EC82

    ECE, SJBIT

    Sub: Embedded System Design Sub code: 06EC82

    Sem: VIII

    PART - A

    UNIT - 1

    INTRODUCTION: Overview of embedded systems, embedded system design challenges, common design metrics and optimizing them. Survey of different embedded system design technologies, trade-offs. Custom Single-Purpose Processors, Design of custom single-purpose processors. 4 Hours

    UNIT - 2

    SINGLE-PURPOSE PROCESSORS: Hardware, Combinational Logic, Sequential Logic, RT level Combinational and Sequential Components, Optimizing single-purpose processors. Single-Purpose Processors: Software, Basic Architecture, Operation, Programmer's View, Development Environment, ASIPs. 6 Hours

    UNIT - 3

    Standard Single-Purpose Peripherals, Timers, Counters, UART, PWM, LCD Controllers, Keypad Controllers, Stepper Motor Controller, A to D Converters, Examples. 6 Hours

    UNIT - 4

    MEMORY: Introduction, Common Memory Types, Composing Memory, Memory Hierarchy and Cache, Advanced RAM. Interfacing, Communication Basics, Microprocessor Interfacing, Arbitration, Advanced Communication Principles, Protocols - Serial, Parallel. 8 Hours

    PART - B

    UNIT - 5

    INTERRUPTS: Basics - Shared Data Problem - Interrupt Latency. Survey of Software Architecture, Round Robin, Round Robin with Interrupts - Function Queues - Scheduling - RTOS Architecture. 8 Hours

    UNIT - 6

    INTRODUCTION TO RTOS, MORE OS SERVICES: Tasks - States - Data - Semaphores and Shared Data. More Operating System Services - Message Queues - Mail Boxes - Timers - Events - Memory Management. 8 Hours

    UNIT - 7 & 8

    Basic Design Using RTOS: Principles - An Example, Encapsulating Semaphores and Queues. Hard Real-Time Scheduling Considerations - Saving Memory Space and Power. Hardware/Software Co-Design Aspects in Embedded Systems. 12 Hours

    INDEX SHEET

    SL.NO TOPIC

    UNIT - 1: INTRODUCTION: Overview of embedded systems

    01 Embedded systems overview
    02 Design challenges, common design metrics
    03 Processor technology
    04 IC technology
    05 Design technology
    06 Tradeoffs
    07 Recommended questions and solutions

    UNIT - 2: CUSTOM SINGLE-PURPOSE PROCESSORS

    01 HARDWARE: Introduction, combinational logic
    02 Sequential logic
    03 Custom single-purpose processor design
    04 RT level processor design
    05 Optimizing custom processors
    06 SOFTWARE: Basic architecture
    07 Operation
    08 Programmer's view
    09 Development environment
    10 ASIPs
    11 Recommended questions and solutions

    UNIT - 3: Standard Single-Purpose Processors: Peripherals

    01 Introduction, timers, counters, watchdog timers
    02 UART, PWM
    03 LCD controllers, stepper motor controllers
    04 Analog to digital converters, RTC
    05 Recommended questions and solutions

    UNIT - 4: Memory and Microprocessor Interfacing

    01 Introduction, memory write ability
    02 Common memory types
    03 Composing memory
    04 Memory hierarchy and cache
    05 Advanced RAM
    06 Communication basics
    07 Microprocessor interfacing
    08 Arbitration
    09 Multilevel bus architectures
    10 Advanced communication principles
    11 Recommended questions and solutions

    UNIT - 5: INTERRUPTS and Survey of Software Architecture

    01 Shared data problem
    02 Round robin
    03 Function queues
    04 RTOS architecture
    05 Recommended questions and solutions

    UNIT - 6: INTRODUCTION TO RTOS, MORE ON OS SERVICES

    01 Tasks, states, data
    02 Semaphores
    03 Message queues, mail boxes
    04 Events, memory management
    05 Recommended questions and solutions

    UNIT - 7 & 8: BASIC DESIGN USING RTOS

    01 Principles
    02 Encapsulating semaphores
    03 Hard real-time scheduling considerations
    04 Saving memory and power
    05 Recommended questions and solutions

    PART A

    UNIT- 1

    INTRODUCTION: Overview of embedded systems, embedded system design challenges,

    common design metrics and optimizing them. Survey of different embedded system design

    technologies, trade-offs. Custom Single-Purpose Processors, Design of custom single purpose

    processors.

    4 Hours

    TEXT BOOKS:

    1. Embedded System Design: A Unified Hardware/Software Introduction - Frank Vahid, Tony Givargis, John Wiley & Sons, Inc., 2002

    2. An Embedded Software Primer - David E. Simon, Pearson Education, 1999

    REFERENCE BOOKS:

    1. Embedded Systems: Architecture and Programming - Raj Kamal, TMH, 2008

    2. Embedded Systems Architecture: A Comprehensive Guide for Engineers and Programmers - Tammy Noergaard, Elsevier Publication, 2005

    3. Embedded C Programming - Barnett, Cox & O'Cull, Thomson, 2005

    EMBEDDED SYSTEM DESIGN

    UNIT 1

    INTRODUCTION

    1.1 Embedded systems overview

    1.2 Design Challenges

    1.3 Processor Technology

    1.4 IC Technology

    1.5 Design Technology

    1.1. Embedded systems overview

    An embedded system is nearly any computing system other than a desktop computer. It is a dedicated system that performs its desired function repeatedly from power-up.

    Embedded systems are found in a variety of common electronic devices, such as:

    (a) consumer electronics: cell phones, pagers, digital cameras, camcorders, videocassette recorders, VCD players, portable video games, calculators, and personal digital assistants;
    (b) home appliances: microwave ovens, answering machines, thermostats, home security systems, washing machines, and lighting systems;
    (c) office automation: fax machines, copiers, printers, and scanners;
    (d) business equipment: cash registers, curbside check-in, alarm systems, card readers, product scanners, and automated teller machines;
    (e) automobiles: transmission control, cruise control, fuel injection, anti-lock brakes, and active suspension.

    Common characteristics of embedded systems:

    Embedded systems have several common characteristics that distinguish them from other computing systems:

    1. Single-functioned: An embedded system executes a single program repeatedly. The entire program runs in a loop, over and over again.

    2. Tightly constrained: The system should cost little, run fast enough to process data in real time, fit on a single chip, and consume as little power as possible.

    3. Reactive and real-time: An embedded system should continuously react to changes in its environment, and should process and compute data in real time, without delay.

    Fig 1.1 An embedded system example: a digital camera

    1.2 Design challenge

    Design metrics:

    A design metric is a measure of an implementation's features, such as cost, size, performance and power. An embedded system

    - must cost little
    - must be sized to fit on a single chip
    - must perform in real time (response time)
    - must consume minimum power

    The embedded system designer must build an implementation that provides the desired functionality. Apart from meeting the functionality, the designer should also optimize numerous design metrics.

    Common design metrics that a design engineer should consider:

    - NRE (non-recurring engineering) cost: the one-time monetary cost of designing the system.
    - Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost.
    - Size: the physical space required by the system, often measured in bytes for software and in number of gates for hardware.
    - Performance: the execution or response time of the system.
    - Power: the amount of power consumed by the system, which may determine the lifetime of the battery and the cooling requirements of the IC. More power means more heat.
    - Flexibility: the ability to change the functionality of the system.
    - Time to prototype: the time needed to build a working version of the system without incurring heavy NRE cost.
    - Time to market: the time required to develop the system and release it to the market.
    - Maintainability: the ability to modify the system after its release to the market.
    - Correctness: our confidence that we have implemented the system's functionality correctly.
    - Safety: the probability that the system will not cause harm.

    Metrics typically compete with one another: improving one often worsens another.

    Fig : 1.2 Design metric competition

    1.2.1 Time-to-Market Design Metric:

    - Time to market: Introducing an embedded system to the market early can make a big difference to the system's profitability. Market windows are generally very narrow, often on the order of a few months. Missing this window can mean a significant loss in sales.

    Fig 1.3 Time to market: (A) market window, (B) simplified revenue model for computing revenue loss

    Let's investigate the loss of revenue that can occur when a product enters the market late. We can use a simple triangle model in which the y axis represents the market level and the x axis represents time, including the point of entry to the market. The market rises until it peaks at time W and then falls until the end of the product's life at time 2W. The revenue for an on-time market entry is the area of the triangle labeled "On-time", and the revenue for a delayed entry is the area of the triangle labeled "Delayed". The revenue loss for a delayed entry is the difference between these two areas.

    Percentage revenue loss = ((On-time - Delayed) / On-time) * 100%

    Area of a triangle = 1/2 * base * height

    W: height of the on-time triangle (the market rise), which is also the time at which the market peaks
    D: delay in market entry (in weeks or months)
    2W: product's lifetime

    Area of on-time triangle = 1/2 * 2W * W = W^2

    Area of delayed triangle = 1/2 * (W - D + W) * (W - D)

    Percentage revenue loss = (D * (3W - D) / (2 * W^2)) * 100%

    Ex: the product's lifetime is 52 weeks (2W = 52, so W = 26) and the delay in entering the market is D = 4 weeks.

    Percentage revenue loss = (4 * (3 * 26 - 4) / (2 * 26^2)) * 100% = (296 / 1352) * 100%, which is approximately 22%
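    The triangle model above can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `percent_revenue_loss` and its parameters are illustrative assumptions. It computes the two triangle areas directly rather than using the closed-form expression, so it also serves as a check on that formula.

```python
def percent_revenue_loss(lifetime: float, delay: float) -> float:
    """Percentage revenue loss under the simplified triangle model.

    lifetime is the product lifetime 2W and delay is the late entry D,
    both in the same time unit (for example weeks).
    """
    w = lifetime / 2.0  # the market peaks at time W
    if not 0 <= delay <= w:
        raise ValueError("the model assumes 0 <= D <= W")
    on_time = 0.5 * (2 * w) * w                    # area of the on-time triangle
    delayed = 0.5 * (2 * w - delay) * (w - delay)  # area of the delayed triangle
    return (on_time - delayed) / on_time * 100.0


if __name__ == "__main__":
    # Example from the notes: 52-week lifetime, 4-week delay.
    print(round(percent_revenue_loss(52, 4)))  # 22
```

    With no delay the loss is 0%, and it grows quickly as D approaches W.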

    1.2.2 The NRE and Unit Cost Design Metrics:

    For the NRE and unit cost metrics, the best technology choice depends on the number of units produced. Consider three technologies:

    Technology | NRE cost | Unit cost
    A          | $2,000   | $100
    B          | $30,000  | $30
    C          | $100,000 | $2

    Total cost = NRE cost + unit cost * number of units

    Per-product cost = total cost / number of units
                     = NRE cost / number of units + unit cost
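    The cost formulas above can be applied to the three technologies to see how the best choice shifts with volume. This is a small Python sketch under the figures listed in the notes; the names `TECHNOLOGIES`, `total_cost`, `per_product_cost` and `cheapest` are illustrative assumptions.

```python
# (NRE cost, unit cost) in dollars for technologies A, B and C from the notes.
TECHNOLOGIES = {"A": (2_000, 100), "B": (30_000, 30), "C": (100_000, 2)}


def total_cost(tech: str, units: int) -> int:
    """Total cost = NRE cost + unit cost * number of units."""
    nre, unit = TECHNOLOGIES[tech]
    return nre + unit * units


def per_product_cost(tech: str, units: int) -> float:
    """Per-product cost = total cost / number of units."""
    return total_cost(tech, units) / units


def cheapest(units: int) -> str:
    """Technology with the lowest total cost at the given volume."""
    return min(TECHNOLOGIES, key=lambda tech: total_cost(tech, units))


if __name__ == "__main__":
    for volume in (100, 1_000, 10_000):
        print(volume, cheapest(volume), per_product_cost(cheapest(volume), volume))
```

    Setting the total costs equal shows that A and B break even at 400 units and B and C at 2,500 units, so low volumes favour A, medium volumes B, and high volumes C.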

    1.2.3 The Performance Design Metric:

    Performance of a system is a measure of how long the system takes to execute our desired tasks. There are several measures of performance. The two main measures are:

    - Latency (response time): the time between the start of a task and its completion.
    - Throughput: the number of tasks that are processed per unit time.

    Speedup is a method of comparing the performance of two systems:

    Speedup of A over B = performance of A / performance of B.
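    As a small illustration of the speedup ratio, the Python sketch below assumes that performance is expressed so that larger is better: either throughput in tasks per second, or the reciprocal of latency. The function names and the numbers are hypothetical and are not taken from the notes.

```python
def performance_from_latency(latency_s: float) -> float:
    """Treat performance as the reciprocal of latency (larger is better)."""
    return 1.0 / latency_s


def speedup(perf_a: float, perf_b: float) -> float:
    """Speedup of A over B = performance of A / performance of B."""
    return perf_a / perf_b


if __name__ == "__main__":
    # Hypothetical systems: A responds in 0.25 s, B responds in 0.5 s.
    a = performance_from_latency(0.25)
    b = performance_from_latency(0.5)
    print(speedup(a, b))  # 2.0
```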

    Technologies used in embedded systems:

    A technology is a manner of accomplishing a task. Three types of technologies are central to embedded system design:

    - Processor technologies
    - IC technologies
    - Design technologies

    Processor technology: relates to the architecture of the computation engine used to implement a system's desired functionality. The term processor is generally associated with programmable software processors, but many non-programmable digital systems can also be thought of as processors.

    Single-purpose processor: a digital system designed to execute exactly one function. Performance may be good, but flexibility is poor.

    Application-specific processor: may serve as a compromise between single-purpose and general-purpose processors. An ASIP is a programmable processor optimized for a particular class of applications having common characteristics, such as embedded control, digital signal processing, or telecommunications. It provides flexibility while still achieving good performance, power and size.

    General-purpose processor: The designer of a general-purpose processor, or microprocessor, builds a programmable device that is suitable for a variety of applications, to maximize sales.

    Design considerations:

    - Should accommodate different kinds of programs
    - Should provide a general datapath to handle a variety of computations

    Design technology: involves converting our concept of desired functionality into an implementation. The design process should produce an implementation that optimizes the design metrics, and should do so quickly.

    Variations of the top-down design process have become popular.

    1.3.1 Processor Technologies:

    1. General-Purpose Processors: Software

    2. Single-Purpose Processors: Hardware

    3. Application-Specific Processors: Application-specific instruction-set processors (ASIPs)

    1. General-Purpose Processors: Software

    They are programmable devices used in a variety of applications, and are also known as microprocessors. They have a program memory and a general datapath with a large register file and a general ALU. The datapath must be general enough to handle a variety of computations. The programmer writes a program, stored in the program memory, that carries out the required functionality using the features (instructions) provided by the general datapath. This is called the software portion of the system. Such a processor offers several benefits: low time-to-market, low NRE cost and high flexibility.

    Design time and NRE cost are low, because the designer must only write a program and need not do any digital design. Flexibility is high, because changing functionality requires only changing the program. Unit cost may be relatively low in small quantities, since the processor manufacturer sells large quantities to other customers and hence distributes the NRE cost over many units. Performance may be fast for computation-intensive applications, if using a fast processor, due to advanced architecture features and leading-edge IC technology.

    There are also some design-metric drawbacks: unit cost may be too high for large quantities, performance may be slow for certain applications, and size and power may be large due to unnecessary processor hardware.

    Figure 1.4(d) illustrates the use of a single-purpose processor in our embedded system example, representing an exact fit of the desired functionality, nothing more, nothing less.

    Fig 1.4 Processors vary in their customization for the problem at hand: (a) desired functionality, (b) general-purpose processor, (c) application-specific processor, (d) single-purpose processor.

    Fig 1.5 Implementing desired functionality on a general-purpose processor

    2. Single-Purpose Processors: Hardware

    A single-purpose processor is a digital circuit designed to execute exactly one program. It contains only the components needed to execute that program, and it has no program memory. The user cannot change the functionality of the chip. Such processors are fast, low-power and small.

    An embedded system designer creates a single-purpose processor by designing a custom digital circuit. Using a single-purpose processor in an embedded system results in several design metric benefits and drawbacks, which are essentially the inverse of those for general-purpose processors. Performance may be fast, size and power may be small, and unit cost may be low for large quantities, while design time and NRE costs may be high, flexibility is low, unit cost may be high for small quantities, and performance may not match general-purpose processors for some applications.

    Fig 1.6 Implementing desired functionality on a single-purpose processor

    3. Application-Specific Processors: Application-specific instruction-set processors (ASIPs)

    They are programmable processors optimized for a particular class of applications having common characteristics. They strike a compromise between general-purpose and single-purpose processors. They have a program memory, an optimized datapath and special functional units. They offer some flexibility while still achieving good performance, size and power.

    An application-specific instruction-set processor (ASIP) can serve as a compromise between the above processor options. An ASIP is designed for a particular class of applications with common characteristics, such as digital signal processing, telecommunications, embedded control, etc. The designer of such a processor can optimize the datapath for the application class, perhaps adding special functional units for common operations and eliminating other infrequently used units.

    Fig 1.7 Implementing desired functionality on an application-specific processor

    Digital signal processors (DSPs) are a common class of ASIP, so they deserve special mention. A DSP is a processor designed to perform common operations on digital signals, which are the digital encodings of analog signals like video and audio. These operations carry out common signal processing tasks like signal filtering, transformation, or combination. Such operations are usually math-intensive, including operations like multiply and add or shift and add. To support such operations, a DSP may have special-purpose datapath components such as a multiply-accumulate unit, which can perform a computation like T = T + M[i]*k using only one instruction. Because DSP programs often manipulate large arrays of data, a DSP may also include special hardware to fetch sequential data memory locations in parallel with other operations, to further speed execution.
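    The multiply-accumulate computation T = T + M[i]*k can be written out in software as a loop. The Python sketch below is only an illustration of the arithmetic; the function name `multiply_accumulate` and the sample values are assumptions. On a DSP with a multiply-accumulate unit, each pass through the loop body would map to a single instruction.

```python
from typing import Sequence


def multiply_accumulate(m: Sequence[float], k: float) -> float:
    """Accumulate T = T + M[i]*k over every element of M."""
    t = 0.0
    for m_i in m:
        t = t + m_i * k  # one multiply-accumulate (MAC) step
    return t


if __name__ == "__main__":
    # Hypothetical data: three samples scaled by a constant coefficient.
    print(multiply_accumulate([1, 2, 3], 2))  # 12.0
```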

    Highlight the merits and demerits of single-purpose processors and general-purpose processors.

    Single-Purpose Processors:

    Merits:

    1. They are fast
    2. They consume low power
    3. They have small size
    4. Unit cost may be low for large quantities

    Demerits:

    1. NRE costs may be high
    2. Low flexibility
    3. Unit cost may be high for small quantities
    4. Performance may not match general-purpose processors for some applications

    General-Purpose Processors:

    Merits:

    1. High flexibility
    2. Low NRE costs
    3. Low time to market
    4. Performance may be fast for computation-intensive applications

    Demerits:

    1. Unit cost may be relatively high for large quantities
    2. Performance may be slow for certain applications
    3. Size and power may be large due to unnecessary processor hardware

    How is a single-purpose processor distinctly different from a general-purpose processor?

    Sl. No. | Single-Purpose Processor | General-Purpose Processor
    1 | Executes exactly one program. | Executes any program written by the user.
    2 | The functionality cannot be changed. | The functionality can be changed by the user by writing the required program.
    3 | Does not have program memory. | Has program memory.
    4 | Has no flexibility and contains only the resources required for its particular functionality. | Has a very large amount of resources, which may or may not be used for a particular functionality, as decided by the user.
    5 | Merits: fast, low power, small size, and unit cost may be low for large quantities. | Merits: high flexibility, low NRE costs, low time to market, and performance may be fast for computation-intensive applications.

    1.4 IC Technology

    Every processor must eventually be implemented on an IC. IC technology involves the manner in which we map a digital (gate-level) implementation onto an IC. An IC (Integrated Circuit), often called a chip, is a semiconductor device consisting of a set of connected transistors and other devices. A number of different processes exist to build semiconductors, the most popular of which is CMOS (Complementary Metal Oxide Semiconductor). The IC technologies differ by how customized the IC is for a particular implementation. IC technology is independent of processor technology; any type of processor can be mapped to any type of IC technology.

    Fig 1.8 The independence of processor and IC technologies: any processor technology can be mapped to any IC technology.

    To understand the differences among IC technologies, we must first recognize that semiconductors consist of numerous layers. The bottom layers form the transistors. The middle layers form logic gates. The top layers connect these gates with wires. One way to create these layers is by depositing photo-sensitive chemicals on the chip surface and then shining light through masks to change regions of the chemicals. Thus, the task of building the layers is actually one of designing appropriate masks. A set of masks is often called a layout. The narrowest line that we can create on a chip is called the feature size, which today is well below one micrometer (sub-micron).

    1.4.1 Full-custom/VLSI

    In a full-custom IC technology, we optimize all layers for our particular embedded system's digital implementation. Such optimization includes placing the transistors to minimize interconnection lengths, sizing the transistors to optimize signal transmissions, and routing wires among the transistors. Once we complete all the masks, we send the mask specifications to a fabrication plant that builds the actual ICs. Full-custom IC design, often referred to as VLSI (Very Large Scale Integration) design, has very high NRE cost and long turnaround times (typically months) before the IC becomes available, but can yield excellent performance with small size and power. It is usually used only in high-volume or extremely performance-critical applications.

    1.4.2 Semi-custom ASIC (gate array and standard cell)

    In an ASIC (Application-Specific IC) technology, the lower layers are fully or partially built, leaving us to finish the upper layers. In a gate array technology, the masks for the transistor and gate levels are already built (i.e., the IC already consists of arrays of gates). The remaining task is to connect these gates to achieve our particular implementation. In a standard cell technology, logic-level cells (such as an AND gate or an AND-OR-INVERT combination) have their mask portions pre-designed, usually by hand. Thus, the remaining task is to arrange these portions into complete masks for the gate level, and then to connect the cells. ASICs are by far the most popular IC technology, as they provide good performance and size with much less NRE cost than full-custom ICs.

    1.4.3 PLD

    In a PLD (Programmable Logic Device) technology, all layers already exist, so we can purchase the actual IC. The layers implement a programmable circuit, where programming has a lower-level meaning than a software program. The programming that takes place may consist of creating or destroying connections between wires that connect gates, either by blowing a fuse or by setting a bit in a programmable switch. Small devices, called programmers, connected to a desktop computer can typically perform such programming.

    We can divide PLDs into two types, simple and complex. One type of simple PLD is a PLA (Programmable Logic Array), which consists of a programmable array of AND gates and a programmable array of OR gates. Another type is a PAL (Programmable Array Logic), which uses just one programmable array to reduce the number of expensive programmable components. One type of complex PLD, growing very rapidly in popularity over the past decade, is the FPGA (Field Programmable Gate Array), which offers more general connectivity among blocks of logic, rather than just arrays of logic as with PLAs and PALs, and is thus able to implement far more complex designs.

    PLDs offer very low NRE cost and almost instant IC availability. However, they are typically bigger than ASICs, may have higher unit cost, may consume more power, and may be slower (especially FPGAs). They still provide reasonable performance, though, so they are especially well suited to rapid prototyping.
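    The two-array structure of a PLA can be modelled in a few lines of software. The Python sketch below is an illustration only, with assumed names (`pla_eval`, `AND_PLANE`, `OR_PLANE`) and an assumed encoding of the "programmed" connections. Each entry of the AND plane is a product term, each entry of the OR plane lists the product terms feeding one output, and the example programs the arrays as a half adder.

```python
from typing import Dict, List, Sequence


def pla_eval(and_plane: List[Dict[int, int]],
             or_plane: List[List[int]],
             inputs: Sequence[int]) -> List[int]:
    """Evaluate a PLA for one input vector.

    and_plane: one dict per product term, mapping an input index to the bit
               it must have (1 = true literal, 0 = complemented literal).
    or_plane:  one list per output, naming the product terms that are ORed.
    """
    products = [all(inputs[i] == bit for i, bit in term.items())
                for term in and_plane]
    return [int(any(products[p] for p in terms)) for terms in or_plane]


# Half adder on inputs (a, b): sum = a.b' + a'.b, carry = a.b
AND_PLANE = [{0: 1, 1: 0}, {0: 0, 1: 1}, {0: 1, 1: 1}]
OR_PLANE = [[0, 1], [2]]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, pla_eval(AND_PLANE, OR_PLANE, (a, b)))
```

    Changing only the two tables changes the function the "device" implements, which mirrors how a real PLA is programmed without altering its layers.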

    1.5 DESIGN TECHNOLOGY:

    Design technology involves the manner in which we convert our concept of desired system functionality into an implementation. We must not only design the implementation to optimise design metrics, but we must do so quickly.

    Variations of a top-down design process have become popular in the past decade, an ideal form of which is illustrated in the figure. The designer refines the system through several abstraction levels. At the system level, the designer describes the desired functionality in an executable language like C. This is called the system specification.

    The designer refines this specification by distributing portions of it among several general-purpose and/or single-purpose processors, yielding behavioural specifications for each processor.

    The designer refines these specifications into register-transfer (RT) specifications by converting behaviour on general-purpose processors to assembly code, and by converting behaviour on single-purpose processors to a connection of register-transfer components and state machines. The designer then refines the RT-level specification into a logic specification.

    Finally, the designer refines the remaining specifications into an implementation consisting of machine code for general-purpose processors and a gate-level netlist for single-purpose processors.

    Fig 1.9 Ideal top-down design process, and productivity improvers.

    There are three main approaches to improving the design process for increased productivity, which we label as compilation/synthesis, libraries/IP, and test/verification. Several other approaches also exist.

    1.5.1 Compilation/Synthesis

    Compilation/synthesis lets a designer specify desired functionality in an abstract manner, and automatically generates lower-level implementation details. Describing a system at high abstraction levels can improve productivity by reducing the amount of detail, often by an order of magnitude, that a designer must specify.

    A logic synthesis tool converts Boolean expressions into a connection of logic gates (called a netlist). A register-transfer (RT) synthesis tool converts finite-state machines and register transfers into a datapath of RT components and a controller of Boolean equations. A behavioral synthesis tool converts a sequential program into finite-state machines and register transfers. Likewise, a software compiler converts a sequential program to assembly code, which is essentially register-transfer code. Finally, a system synthesis tool converts an abstract system specification into a set of sequential programs on general-purpose and single-purpose processors.

    The relatively recent maturation of RT and behavioral synthesis tools has enabled a unified view of the design process for single-purpose and general-purpose processors. Design for the former is commonly known as hardware design, and design for the latter as software design. In the past, the design processes were radically different: software designers wrote sequential programs, while hardware designers connected components.

    Fig 1.10 The co-design ladder: recent maturation of synthesis enables a unified view of hardware and software.

    1.5.2 Libraries/IP

    Libraries involve re-use of pre-existing implementations. Using libraries of existing implementations can improve productivity if the time it takes to find, acquire, integrate and test a library item is less than that of designing the item oneself. A logic-level library may consist of layouts for gates and cells. An RT-level library may consist of layouts for RT components, like registers, multiplexors, decoders, and functional units. A behavioral-level library may consist of commonly used components, such as compression components, bus interfaces, display controllers, and even general-purpose processors. The advent of system-level integration has caused a great change in this level of library.

    1.5.3 Test/Verification

    Test/verification involves ensuring that functionality is correct. Such assurance can prevent time-consuming debugging at low abstraction levels and iterating back to high abstraction levels. Simulation is the most common method of testing for correct functionality, although more formal verification techniques are growing in popularity.

    At the logic level, gate-level simulators provide output signal timing waveforms given input signal waveforms. Likewise, general-purpose processor simulators execute machine code. At the RT level, hardware description language (HDL) simulators execute RT-level descriptions and provide output waveforms given input waveforms. At the behavioral level, HDL simulators simulate sequential programs, and co-simulators connect HDL and general-purpose processor simulators to enable hardware/software co-verification. At the system level, a model simulator simulates the initial system specification using an abstract computation model, independent of any processor technology, to verify correctness and completeness of the specification.
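    To make the idea of gate-level simulation concrete, the Python sketch below evaluates a small netlist for one input vector. It is a deliberately simplified illustration with assumed names (`GATES`, `simulate`, `HALF_ADDER`): it computes logic values only and models no timing, whereas the gate-level simulators described above produce timing waveforms.

```python
from typing import Dict, List, Tuple

# Two-input gate functions operating on the bits 0 and 1.
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

# A netlist entry is (output signal, gate type, first input, second input).
Netlist = List[Tuple[str, str, str, str]]


def simulate(netlist: Netlist, inputs: Dict[str, int]) -> Dict[str, int]:
    """Evaluate every gate once; the netlist must be in topological order."""
    signals = dict(inputs)
    for out, gate, in_a, in_b in netlist:
        signals[out] = GATES[gate](signals[in_a], signals[in_b])
    return signals


# Hypothetical design under test: a half adder.
HALF_ADDER: Netlist = [("sum", "XOR", "a", "b"), ("carry", "AND", "a", "b")]

if __name__ == "__main__":
    result = simulate(HALF_ADDER, {"a": 1, "b": 1})
    print(result["sum"], result["carry"])  # 0 1
```

    Verification then amounts to comparing such outputs against the expected values for each test input.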

    1.5.4 More productivity improvers

    There are numerous additional approaches to improving designer productivity. Standards focus on developing well-defined methods for specification, synthesis and libraries. Such standards can reduce the problems that arise when a designer uses multiple tools, or retrieves or provides design information from or to other designers. Common standards include language standards, synthesis standards and library standards.

    Languages focus on capturing desired functionality with minimum designer effort. For example, the sequential programming language of C is giving way to the object oriented language of C++, which in turn has given some ground to Java. As another example, state-machine languages permit direct capture of functionality as a set of states and transitions, which can then be translated to other languages like C.

    Frameworks provide a software environment for the application of numerous tools throughout the design process and management of versions of implementations. For example, a framework might generate the UNIX directories needed for various simulators and synthesis tools, supporting application of those tools through menu selections in a single graphical user interface.

    RECOMMENDED QUESTIONS

    UNIT 1

    Overview of embedded systems

    1. What is an embedded system? Why is it so hard to define ES?

    2. List and define the three main characteristics of embedded systems that distinguish such systems from other computing systems.

    3. What is a design metric?

    4. List a pair of design metrics that may compete with one another, providing an intuitive explanation of the reason behind it.

    5. What is a market window, and why is it so important to reach the market early in this window?

    6. What is NRE cost?

    7. List and define the three main processor technologies. What are the benefits of using different processor technologies?

    8. List the main IC technologies and list out the benefits.

    9. List the three main design technologies. How are they helpful to designers?

    10. Provide a definition of Moore's law.

    11. Compute the annual growth rate of IC capacity and designer productivity.

    12. What is the design gap?

    13. What is a renaissance engineer, and why is it so important in the current market?

    14. Define what is meant by the mythical man-month.

    QUESTION PAPER SOLUTION

    UNIT 1

    Q1. Highlight the merits and demerits of single purpose processors and general-purpose processors.

    Single Purpose Processors:

    Merits:

    1. They are fast

    2. They consume low power

    3. They have small size

    4. Unit cost may be low for large quantities

    Demerits:

    1. NRE costs may be high

    2. Low flexibility

    3. Unit cost high for small quantities

    4. Performance may not match for some applications

    General Purpose Processors:

    Merits:

    1. High flexibility

    2. Low NRE costs

    3. Low time to market

    4. Performance may be fast for computation-intensive applications.

    Demerits:

    1. Unit cost may be relatively high for large quantities.

    2. Performance may be slower for certain applications.

    3. Size and power may be large due to unnecessary processor hardware.

    Q2. How is a single purpose processor distinctly different from a general-purpose processor?

    Sl. No. | Single Purpose Processor | General Purpose Processor
    1. | Executes exactly one program. | Executes any program written by the user.
    2. | The functionality cannot be changed. | The functionality can be changed by the user by writing the required program.
    3. | They do not have program memory. | They have program memory.
    4. | Do not have any flexibility and contain only the resources required for that particular functionality. | Has a very large amount of resources, which may or may not be used for a particular functionality, as decided by the user.
    5. | Merits include: they are fast, they consume low power, they have small size, and the unit cost may be low for large quantities. | Merits include: high flexibility, low NRE costs, low time to market, and performance that may be fast for computation-intensive applications.

    Q3. Explain the three Processor Technologies Briefly

    1. General Purpose Processors (Software):

    They are programmable devices used in a variety of applications. They are also known as microprocessors. They have a program memory and a general data path with a large register file and a general ALU. The data path must be large enough to handle a variety of computations. The programmer writes the program that carries out the required functionality into the program memory and uses the features (instructions) provided by the general data path. This is called the software portion of the system. Such processors offer several benefits: low time-to-market, low NRE costs and high flexibility.

    2. Single Purpose Processors (Hardware):

    This is a digital circuit designed to execute exactly one program. It contains only the components needed to execute a single program, and it contains no program memory. The user cannot change the functionality of the chip. Such processors are fast, low power and small.

    3. Application Specific Processors: Application-specific instruction-set processors (ASIP)

    They are programmable processors optimized for a particular class of applications having common characteristics. They strike a compromise between general-purpose and single-purpose processors. They have a program memory, an optimized data path and special functional units. They offer some flexibility along with good performance, size and power.

    Q4. What are the common design metrics that a design engineer should consider?

    - NRE (non-recurring engineering) cost: the one-time monetary cost of designing the system.

    - Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost.

    - Size: the physical space required by the system, often measured in bytes for software and in number of gates for hardware.

    - Performance: the execution/response time of the system.

    - Power: the amount of power consumed by the system, which may determine the lifetime of the battery and the cooling requirement of the IC. More power means more heat.

    - Flexibility: the ability to change the functionality of the system.

    - Time to prototype: the time needed to build a working system without incurring heavy NRE.

    - Time to market: the time required to develop the system and release it to the market.

    - Maintainability: the ability to modify the system after its release to the market.

    - Correctness: our confidence that we have implemented the system's functionality correctly.

    - Safety: the probability that the system does not cause any harm.

    Metrics typically compete with one another: improving one often leads to worsening of another.

    Q5. Write short notes on IC technology

    Every processor must eventually be implemented on an IC. IC technology involves the manner in which

    we map a digital (gate-level) implementation onto an IC. An IC (Integrated Circuit), often called a chip,

    is a semiconductor device consisting of a set of connected transistors and other devices. A number of

    different processes exist to build semiconductors, the most popular of which is CMOS (Complementary

    Metal Oxide Semiconductor). The IC technologies differ by how customized the IC is for a particular

    implementation. IC technology is independent from processor technology; any type of processor can be

    mapped to any type of IC technology.

    The independence of processor and IC technologies: any processor technology can be

    mapped to any IC technology.

    To understand the differences among IC technologies, we must first recognize that semiconductors

    consist of numerous layers. The bottom layers form the transistors. The middle layers form logic gates.

    The top layers connect these gates with wires. One way to create these layers is by depositing photo-sensitive chemicals on the chip surface and then shining light through masks to change regions of the

    chemicals. Thus, the task of building the layers is actually one of designing appropriate masks. A set of

    masks is often called a layout. The narrowest line that we can create on a chip is called the feature size,

    which today is well below one micrometer (sub-micron). For each IC technology, all layers must

    eventually be built to get a working IC; the question is who builds each layer and when.

    Q6. Derive the equation for percentage revenue loss for any market rise. A product was delayed by 4 weeks in releasing to market. The peak revenue for on-time entry to market would occur after 20 weeks for a market rise angle of 45°. Find the percentage revenue loss.

    Ans: Let us investigate the loss of revenue that can occur due to delayed entry of a product in the market. We can use a simple triangle model in which the y axis is the market rise and the x axis is time, including the point of entry to the market. The revenue for an on-time market entry is the area of the triangle labeled "on-time", and the revenue for a delayed-entry product is the area of the triangle labeled "delayed". The revenue loss for a delayed entry is the difference of these triangles' areas.

    Percentage revenue loss = $\frac{\text{on-time} - \text{delayed}}{\text{on-time}} \times 100\%$

    Area of a triangle = $\frac{1}{2} \times \text{base} \times \text{height}$

    $W$: height of the market rise at its peak (with a 45° rise angle, the peak occurs at time $W$)

    $D$: delayed entry (in weeks or months)

    $2W$: product's life time

    Area of on-time triangle = $\frac{1}{2} \times 2W \times W = W^2$

    Area of delayed triangle = $\frac{1}{2} \times (W - D + W) \times (W - D)$

    Percentage revenue loss = $\frac{D(3W - D)}{2W^2} \times 100\%$

    Ex: The product's life time is 52 weeks ($2W = 52$, so $W = 26$).

    Delay of entry to the market is 4 weeks ($D = 4$).

    Percentage revenue loss = $\frac{4(3 \times 26 - 4)}{2 \times 26^2} \times 100\% \approx 22\%$
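As a quick check of the triangle model, the following is a minimal Python sketch that computes the loss both from the two triangle areas and from the closed-form expression. The function names are illustrative and not part of the original notes. Under the same model, the figures stated in the question (peak at W = 20 weeks, delay D = 4 weeks) give a loss of 28%, while the 52-week example gives about 22%.

```python
def revenue_loss_percent(w, d):
    """Percentage revenue loss from the two triangle areas.

    w: time of peak revenue for on-time entry (product lifetime is 2*w)
    d: delay in market entry, in the same time unit as w
    Assumes a 45-degree market rise, so the on-time peak height equals w.
    """
    if not 0 <= d <= w:
        raise ValueError("delay must satisfy 0 <= d <= w")
    on_time = 0.5 * (2 * w) * w            # area of the on-time triangle
    delayed = 0.5 * (w - d + w) * (w - d)  # area of the delayed triangle
    return (on_time - delayed) / on_time * 100.0


def closed_form_percent(w, d):
    """Closed-form expression D(3W - D) / (2W^2) * 100."""
    return d * (3 * w - d) / (2 * w * w) * 100.0


if __name__ == "__main__":
    for w, d in [(26, 4), (20, 4)]:
        print(f"W={w}, D={d}: {revenue_loss_percent(w, d):.1f}% loss")
```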

    Q7. Compare GPP, SPP and ASSP along with their block diagrams.

    1. General Purpose Processors (Software):

    They are programmable devices used in a variety of applications. They are also known as microprocessors. They have a program memory and a general data path with a large register file and a general ALU. The data path must be large enough to handle a variety of computations. The programmer writes the program that carries out the required functionality into the program memory and uses the features (instructions) provided by the general data path. This is called the software portion of the system. Such processors offer several benefits: low time-to-market, low NRE costs and high flexibility.

    Design time and NRE cost are low, because the designer must only write a program, but need not do any digital design. Flexibility is high, because changing functionality requires only changing the program. Unit cost may be relatively low in small quantities, since the processor manufacturer sells large quantities to other customers and hence distributes the NRE cost over many units. Performance may be fast for computation-intensive applications, if using a fast processor, due to advanced architecture features and leading edge IC technology.

    There are also some design-metric drawbacks: Unit cost may be too high for large quantities. Performance may be slow for certain applications. Size and power may be large due to unnecessary processor hardware.

    Figure 1.4(d) illustrates the use of a single-purpose processor in our embedded system example,

    representing an exact fit of the desired functionality, nothing more, nothing less.

    Fig 1.4 Processors vary in their customization for the problem at hand: (a) desired functionality, (b) general-purpose processor, (c) application-specific processor, (d) single-purpose processor.

    Fig 1.5 Implementing desired functionality on different General purpose processor

    2. Single Purpose Processors (Hardware):

    This is a digital circuit designed to execute exactly one program. It contains only the components needed to execute a single program, and it contains no program memory. The user cannot change the functionality of the chip. Such processors are fast, low power and small.

    An embedded system designer creates a single-purpose processor by designing a custom digital circuit. Using a single-purpose processor in an embedded system results in several design metric benefits and drawbacks, which are essentially the inverse of those for general purpose processors. Performance may be fast, size and power may be small, and unit cost may be low for large quantities, while design time and NRE costs may be high, flexibility is low, unit cost may be high for small quantities, and performance may not match general-purpose processors for some applications.

    Fig 1.6 Implementing desired functionality on different single purpose processor

    3. Application Specific Processors: Application-specific instruction-set processors (ASIP):

    They are programmable processors optimized for a particular class of applications having common characteristics. They strike a compromise between general-purpose and single-purpose processors. They have a program memory, an optimized data path and special functional units. They offer some flexibility along with good performance, size and power.

    An application-specific instruction-set processor (or ASIP) can serve as a compromise between the above

    processor options. An ASIP is designed for a particular class of applications with common characteristics,

    such as digital-signal processing, telecommunications, embedded control, etc. The designer of such a

    processor can optimize the datapath for the application class, perhaps adding special functional units for

    common operations, and eliminating other infrequently used units.

    Fig 1.7 Implementing desired functionality on different Application Specific processor

    Digital-signal processors (DSPs) are a common class of ASIP, so they deserve special mention. A DSP is a processor designed to perform common operations on digital signals, which are the digital encodings of analog signals like video and audio. These operations carry out common signal processing tasks like signal filtering, transformation, or combination. Such operations are usually math-intensive, including operations like multiply and add or shift and add. To support such operations, a DSP may have special-purpose datapath components such as a multiply-accumulate unit, which can perform a computation like T = T + M[i]*k using only one instruction. Because DSP programs often manipulate large arrays of data, a DSP may also include special hardware to fetch sequential data memory locations in parallel with other operations, to further speed execution.
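To make the multiply-accumulate idea concrete, here is a minimal Python sketch of the computation T = T + M[i]*k applied across an array, as in a simple filter. The function names `mac` and `fir_output`, and the use of one coefficient per sample, are assumptions made for illustration. On a DSP each call to `mac` would correspond to a single instruction; in Python it is only a model of that behavior.

```python
def mac(acc, sample, coeff):
    """One multiply-accumulate step: returns acc + sample * coeff."""
    return acc + sample * coeff


def fir_output(samples, coeffs):
    """Accumulate sample * coefficient products, one MAC step per element."""
    if len(samples) != len(coeffs):
        raise ValueError("samples and coeffs must have the same length")
    t = 0
    for m_i, k_i in zip(samples, coeffs):
        t = mac(t, m_i, k_i)  # T = T + M[i] * k
    return t
```

For example, `fir_output([1, 2, 3], [4, 5, 6])` accumulates 1*4 + 2*5 + 3*6 in three MAC steps.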

    Q8. Suggest two methods to improve productivity.

    There are numerous additional approaches to improving designer productivity. Standards focus on

    developing well-defined methods for specification, synthesis and libraries. Such standards can reduce

    the problems that arise when a designer uses multiple tools, or retrieves or provides design information

    from or to other designers. Common standards include language standards, synthesis standards and

    library standards.

    Languages focus on capturing desired functionality with minimum designer effort. For example, the

    sequential programming language of C is giving way to the object oriented language of C++, which in

    turn has given some ground to Java. As another example, state-machine languages permit direct capture

    of functionality as a set of states and transitions, which can then be translated to other languages like C.

    Frameworks provide a software environment for the application of numerous tools throughout the

    design process and management of versions of implementations. For example, a framework might

    generate the UNIX directories needed for various simulators and synthesis tools, supporting application

    of those tools through menu selections in a single graphical user interface.

    UNIT 2

    SINGLE-PURPOSE PROCESSORS: Hardware, Combinational Logic, Sequential Logic, RT

    level Combinational and Sequential Components, Optimizing single-purpose processors. Single-

    Purpose Processors: Software, Basic Architecture, Operation, Programmers View, Development

    Environment, ASIPS.

    6 Hours

    TEXT BOOKS:

    1. Embedded System Design: A Unified Hardware/Software Introduction - Frank

    Vahid, Tony Givargis, John Wiley & Sons, Inc.2002

    REFERENCE BOOKS:

    1. Embedded Systems: Architecture and Programming, Raj Kamal, TMH. 2008

    2. Embedded Systems Architecture A Comprehensive Guide for Engineers and

    Programmers, Tammy Noergaard, Elsevier Publication, 2005

    3. Embedded C programming, Barnett, Cox & Ocull, Thomson (2005).

    UNIT 2

    CUSTOM SINGLE PURPOSE PROCESSORS: HARDWARE

    2.1 INTRODUCTION:

    A processor is a digital circuit designed to perform computation tasks. A processor consists of a datapath capable of storing and manipulating data and a controller capable of moving data through the datapath.

    A general purpose processor is designed to carry out a wide variety of computation tasks. A single purpose processor is designed specifically to carry out a particular computational task.

    A custom single-purpose processor may be fast, small and low power, but it may also have high NRE cost, longer time-to-market and less flexibility.

    2.2 COMBINATIONAL LOGIC:

    1. Transistors and Logic Gates

    2. Basic combinational logic design

    3. RT level combinational components

    Transistors and Logic Gates:

    A transistor is the basic electrical component in digital systems. A transistor acts as a simple on/off switch. CMOS is one of several transistor designs.

    Fig 2.1 view of CMOS transistor on silicon

    The CMOS transistor consists of a gate, a source and a drain, where the gate controls the current flow from source to drain. A high voltage, typically +3V or +5V, is supplied to represent logic 1, while the low voltage is typically ground and is treated as logic 0.

    When logic 1 is applied to the gate, the transistor conducts, so current flows.

    When logic 0 is applied to the gate, the transistor does not conduct.

    Fig 2.2 a & b CMOS transistor implementation

    Fig 2.2 a, b & c CMOS transistor implementation of inverter, NAND and NOR gate

    Digital system designers work at the abstraction level of logic gates where each gate is

    represented symbolically with Boolean equation as shown in figure 2.3

    Fig 2.3 Basic logic gates

    Combinational logic design:

    A combinational circuit is a digital circuit whose output is purely a function of its present inputs. Such a circuit has no memory of past inputs. An example is shown below.

    Fig 2.4 Combinational logic design: problem description, truth table, output equations, minimized output equations, final circuit
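The design flow named in the caption (problem description, truth table, minimized equations) can be sketched in Python. The problem used here is a hypothetical one chosen for illustration and is not necessarily the one drawn in Fig 2.4: output y is 1 if a is 1, or if b and c are both 1; output z is 1 if exactly one of b and c is 1, or if all three inputs are 1. The sketch builds the truth table from the problem description and checks that a candidate pair of minimized equations matches it on every input row.

```python
from itertools import product


def spec(a, b, c):
    """Outputs (y, z) taken directly from the problem description."""
    y = 1 if (a == 1 or (b == 1 and c == 1)) else 0
    z = 1 if ((b != c) or (a == 1 and b == 1 and c == 1)) else 0
    return y, z


def minimized(a, b, c):
    """Candidate minimized equations: y = a + bc, z = ab + b'c + bc'."""
    y = a | (b & c)
    z = (a & b) | ((1 - b) & c) | (b & (1 - c))
    return y, z


def truth_table(fn):
    """List of rows (a, b, c, y, z) for all eight input combinations."""
    return [(a, b, c) + fn(a, b, c) for a, b, c in product((0, 1), repeat=3)]


if __name__ == "__main__":
    for row in truth_table(spec):
        print(row)
    print("equations match spec:", truth_table(spec) == truth_table(minimized))
```

Because a combinational circuit has no memory, comparing the two functions over all input combinations is a complete check.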

    RT level combinational components:

    Designing complex digital circuits using only logic gates takes time, so combinational components like multiplexors, decoders, adders, comparators, ALUs, etc. are designed and used at the register-transfer (RT) level.

    Fig 2.5 combinational components

    2.3 Sequential logic

    a.Flip flops

    b.RT level sequential components

    c. Sequential logic design

    2.3.1 Flip flops

    A sequential circuit is a digital circuit whose outputs are a function of the present as well as previous input values. The most basic sequential circuit is a flip-flop. A flip-flop stores a single bit.

    D flip-flop: It has two inputs, D and clock. When clock is 1, the value of D is stored in the flip-flop and appears at output Q. When clock is 0, the previously stored bit is maintained and appears at Q.

    SR flip-flop: It has three inputs, S, R and clock. When clock is 1, inputs S and R are examined: if S is 1, a 1 is stored; if R is 1, a 0 is stored. If both S and R are 0, there is no change. If both are 1, the behavior is undefined. Thus S stands for set and R for reset.
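The behavior described above can be modeled with a minimal Python sketch. The class and method names are illustrative, the stored bit is assumed to start at 0, and the model follows the level-sensitive wording of the text (the value is captured whenever clock is 1). Raising an error for S = R = 1 is simply one way to flag the undefined case.

```python
class DFlipFlop:
    """Stores D while clock is 1; holds the stored bit while clock is 0."""

    def __init__(self):
        self.q = 0  # assumed initial state

    def step(self, d, clock):
        if clock == 1:
            self.q = d
        return self.q


class SRFlipFlop:
    """Set/reset storage element examined only while clock is 1."""

    def __init__(self):
        self.q = 0  # assumed initial state

    def step(self, s, r, clock):
        if clock == 1:
            if s == 1 and r == 1:
                raise ValueError("S = R = 1 is undefined")
            if s == 1:
                self.q = 1
            elif r == 1:
                self.q = 0
            # s == 0 and r == 0: no change
        return self.q
```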

    Fig 2.6 Sequential components

    2.3.2 RT level sequential components:

    Registers, shift registers and counters are RT-level sequential components. A register stores n bits from its n-bit data input I, with those stored bits appearing at its output Q; the bits are stored in parallel.

    A shift register stores n bits, but these bits cannot be stored in parallel; instead they are shifted into the register serially. A shift register has one data input I and two control inputs, clock and shift.

    A counter is a register that can also increment (add binary 1 to) its stored binary value. A synchronous input's value only has an effect during a clock edge. An asynchronous input's value affects the circuit independent of the clock. All these are shown in figure 2.6.
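The three components can be sketched as small Python classes. This is an illustrative model only: the class names, the `load`, `shift` and `count` control inputs, the left-shift direction, and the wrap-around on overflow are assumptions, not details taken from figure 2.6. Each call to `clock_edge` stands for one clock edge (synchronous behavior), while `clear` models an asynchronous input that acts immediately.

```python
class Register:
    """n-bit register: all bits are loaded in parallel on a clock edge."""

    def __init__(self, n_bits):
        self.mask = (1 << n_bits) - 1
        self.q = 0

    def clock_edge(self, i, load=1):
        if load:
            self.q = i & self.mask
        return self.q


class ShiftRegister:
    """n-bit register filled one bit per clock edge through serial input I."""

    def __init__(self, n_bits):
        self.mask = (1 << n_bits) - 1
        self.q = 0

    def clock_edge(self, i, shift=1):
        if shift:
            # Assumed direction: shift left, new bit enters at the LSB.
            self.q = ((self.q << 1) | (i & 1)) & self.mask
        return self.q


class Counter:
    """n-bit register that adds 1 to its stored value on a clock edge."""

    def __init__(self, n_bits):
        self.mask = (1 << n_bits) - 1
        self.q = 0

    def clock_edge(self, count=1):
        if count:
            self.q = (self.q + 1) & self.mask  # wraps around on overflow
        return self.q

    def clear(self):
        """Asynchronous clear: takes effect without waiting for a clock edge."""
        self.q = 0
        return self.q
```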

    2.3.3 Sequential logic design

    Sequential logic design can be achieved using a straight forward technique

    which is illustrated below

    Fig 2.7 (a) (b)( c)( d) sequential logic design

    Fig 2.7 (e) (f) sequential logic design

    2.4 Custom single purpose processor design:

    A basic processor consists of a controller and a datapath. The datapath stores and manipulates a system's data. The controller carries out the configuration of the datapath: it sets the datapath control inputs, such as register load and multiplexor select signals, of the register units, functional units and connection units to obtain the desired configuration of the datapath.

    Fig 2.8 A basic processor(a) controller and datapath

    (b) view inside the controller and datapath

    Example program:

    - First create algorithm

    - Convert algorithm to "complex" state machine

      - Known as FSMD: finite-state machine with datapath

      - Can use templates to perform such conversion

    Fig : 2.9 Example program GCD

    - Create a register for any declared variable

    - Create a functional unit for each arithmetic operation

    - Connect the ports, registers and functional units

      - Based on reads and writes

      - Use multiplexors for multiple sources

    - Create a unique identifier for each datapath component control input and output

    Templates for creating state diagram :

    - We finished the datapath

    - We have a state table for the next state and control logic

      - All that's left is combinational logic design

    - This is not an optimized design, but we see the basic steps
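The algorithm-to-FSMD conversion for the GCD example can be imitated in software. The sketch below is a Python model of a subtraction-based GCD written as an explicit state machine with registers x and y; the state names and the one-state-per-statement split are assumptions for illustration, not a copy of Fig 2.9. Each pass through the loop stands for one clock cycle, so the returned cycle count gives a rough feel for the unoptimized design.

```python
def gcd_fsmd(x_i, y_i):
    """Simulate a GCD FSMD; returns (result, number of clock cycles)."""
    if x_i <= 0 or y_i <= 0:
        raise ValueError("inputs must be positive integers")
    x = y = d_o = 0          # datapath registers and output
    state = "LOAD_X"         # controller state register
    cycles = 0
    while state != "DONE":
        cycles += 1
        if state == "LOAD_X":        # x = x_i
            x = x_i
            state = "LOAD_Y"
        elif state == "LOAD_Y":      # y = y_i
            y = y_i
            state = "TEST"
        elif state == "TEST":        # while (x != y)
            state = "COMPARE" if x != y else "OUTPUT"
        elif state == "COMPARE":     # if (x < y)
            state = "SUB_Y" if x < y else "SUB_X"
        elif state == "SUB_Y":       # y = y - x
            y = y - x
            state = "TEST"
        elif state == "SUB_X":       # x = x - y
            x = x - y
            state = "TEST"
        elif state == "OUTPUT":      # d_o = x
            d_o = x
            state = "DONE"
    return d_o, cycles
```

In hardware, the `x != y` and `x < y` tests become comparator outputs fed to the controller, and the two subtractions become functional units in the datapath.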

    Fig 2.10 : Templates for creating state diagram

    2.5 RT level Custom Single Purpose processor Design:

    - We often start with a state machine

      - Rather than algorithm

      - Cycle timing often too central to functionality

    - Example

      - Bus bridge that converts 4-bit bus to 8-bit bus

      - Start with FSMD

      - Known as register-transfer (RT) level

      - Exercise: complete the design
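As a starting point for the exercise, here is a simplified Python model of such a bus bridge written directly as a state machine. It is a sketch under stated assumptions: the signal names (`rdy_in`, `data_in`, `rdy_out`, `data_out`), the three states, and the choice to receive the low nibble first are all hypothetical, and the handshake is reduced to a single ready signal per transfer.

```python
class BusBridge:
    """Collects two 4-bit transfers and emits one 8-bit value."""

    def __init__(self):
        self.state = "WAIT_FIRST4"
        self.data_lo = 0   # register for the first nibble received
        self.data_hi = 0   # register for the second nibble received
        self.data_out = 0
        self.rdy_out = 0

    def clock(self, rdy_in, data_in):
        """Advance one clock cycle; returns (rdy_out, data_out)."""
        self.rdy_out = 0
        if self.state == "WAIT_FIRST4":
            if rdy_in:
                self.data_lo = data_in & 0xF
                self.state = "WAIT_SECOND4"
        elif self.state == "WAIT_SECOND4":
            if rdy_in:
                self.data_hi = data_in & 0xF
                self.state = "SEND8"
        elif self.state == "SEND8":
            self.data_out = (self.data_hi << 4) | self.data_lo
            self.rdy_out = 1
            self.state = "WAIT_FIRST4"
        return self.rdy_out, self.data_out
```

The point of starting at this level is that the cycle-by-cycle behavior is part of the specification: what happens on each clock matters as much as the final value.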

    Fig 2.13 RT level Custom Single Purpose processor Design example

    2.6 Optimizing Custom single-purpose processors

    - Optimization is the task of making design metric values the best possible

    - Optimization opportunities

      - original program

      - FSMD

      - datapath

      - FSM

    Optimizing the original program

    - Analyze program attributes and look for areas of possible improvement

      - number of computations

      - size of variable

      - time and space complexity

      - operations used

        - multiplication and division very expensive

    Fig 2.15 optimizing the program
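One way to see the effect of changing the original program is to count loop iterations for two versions of the same computation. The Python sketch below compares a subtraction-based GCD with a remainder-based one; the function names are illustrative, and the remainder-based version is offered only as an example of reducing the number of computations, not as the exact program in Fig 2.15. Note the trade-off flagged in the list above: the remainder operation is a form of division, so each iteration needs more expensive hardware even though far fewer iterations are required.

```python
def gcd_subtract(x, y):
    """Subtraction-based GCD; returns (gcd, loop iterations)."""
    steps = 0
    while x != y:
        if x < y:
            y -= x
        else:
            x -= y
        steps += 1
    return x, steps


def gcd_modulo(x, y):
    """Remainder-based GCD; returns (gcd, loop iterations)."""
    steps = 0
    if x < y:
        x, y = y, x
    while y != 0:
        x, y = y, x % y
        steps += 1
    return x, steps


if __name__ == "__main__":
    print("subtract:", gcd_subtract(42, 8))
    print("modulo:  ", gcd_modulo(42, 8))
```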

    Optimizing the FSMD:

    - Areas of possible improvements

      - merge states

        - states with constants on transitions can be eliminated, transition taken is already known

        - states with independent operations can be merged

      - separate states

        - states which require complex operations (a*b*c*d) can be broken into smaller states to reduce hardware size

      - scheduling

    Fig 2.16 optimizing the FSMD for GCD

    Optimizing the datapath:

    - Sharing of functional units

      - one-to-one mapping, as done previously, is not necessary

      - if the same operation occurs in different states, they can share a single functional unit

    - Multi-functional units

      - ALUs support a variety of operations, so an ALU can be shared among operations occurring in different states

    Optimizing the FSM:

    - State encoding

      - task of assigning a unique bit pattern to each state in an FSM

      - size of state register and combinational logic vary

      - can be treated as an ordering problem

    - State minimization

      - task of merging equivalent states into a single state

        - two states are equivalent if, for all possible input combinations, they generate the same outputs and transition to the same next state
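The state encoding step can be illustrated with a short Python sketch. It computes the minimum state register width for a given number of states and produces two possible encodings. The function names and the state names are illustrative; one-hot encoding is not mentioned in the notes and is included only as a well-known alternative that trades a wider state register for potentially simpler next-state logic.

```python
def min_state_bits(num_states):
    """Smallest register width that gives every state a unique pattern."""
    if num_states < 1:
        raise ValueError("an FSM needs at least one state")
    return max(1, (num_states - 1).bit_length())


def binary_encoding(states):
    """Assign consecutive binary codes in the order the states are listed."""
    bits = min_state_bits(len(states))
    return {s: format(i, f"0{bits}b") for i, s in enumerate(states)}


def one_hot_encoding(states):
    """Assign one register bit per state (exactly one bit set)."""
    n = len(states)
    return {s: format(1 << i, f"0{n}b") for i, s in enumerate(states)}
```

Reordering the list passed to `binary_encoding` changes which pattern each state receives, which is why encoding can be viewed as an ordering problem: different orderings can lead to different amounts of combinational logic.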

    GENERAL PURPOSE PROCESSORS: SOFTWARE

    A general-purpose processor is a

    - Processor designed for a variety of computation tasks

    - Low unit cost, in part because manufacturer spreads NRE over large numbers of units

      - Motorola sold half a billion 68HC05 microcontrollers in 1996 alone

    - Carefully designed since higher NRE is acceptable

      - Can yield good performance, size and power

    - Low NRE cost, short time-to-market/prototype, high flexibility

      - User just writes software; no processor design

    - a.k.a. "microprocessor": "micro" used when they were implemented on one or a few chips rather than entire rooms

    Basic Architecture:

    A general purpose processor, sometimes called a CPU, consists of a datapath and a control unit linked with memory.

    - Control unit and datapath

      - Note similarity to single-purpose processor

    - Key differences

      - Datapath is general

      - Control unit doesn't store the algorithm; the algorithm is "programmed" into the memory

    Datapath Operations:

    Load

    Read memory location into register

    ALU operation

    Input certain registers through ALU, store back in register

    Store

    Write register to memory location

    Fig 2.17 GPP basic architecture

    Control unit :

    - Control unit: configures the datapath operations

      - Sequence of desired operations ("instructions") stored in memory: the "program"

    - Instruction cycle broken into several sub-operations, each one clock cycle, e.g.:

      - Fetch: Get next instruction into IR

      - Decode: Determine what the instruction means

      - Fetch operands: Move data from memory to datapath register

      - Execute: Move data through the ALU

      - Store results: Write data from register to memory

    Control Unit Sub-Operations:

    - Fetch

      - Get next instruction into IR

      - PC: program counter, always points to next instruction

      - IR: holds the fetched instruction

    - Decode

      - Determine what the instruction means

    - Fetch operands

      - Move data from memory to datapath register

    - Execute

      - Move data through the ALU

      - This particular instruction does nothing during this sub-operation

    - Store results

      - Write data from register to memory

      - This particular instruction does nothing during this sub-operation

    Memory:

    Program information consists of the sequence of instructions that cause the processor to carry out the desired system functionality. Data information represents the values being input, output and transformed by the program. We can store program and data together or separately.

    In a Princeton architecture, data and program words share the same memory space. The Princeton architecture may result in a simpler hardware connection to memory, since only one connection is necessary.

    In a Harvard architecture, the program memory space is distinct from the data memory space. A Harvard architecture, while requiring two connections, can perform instruction and data fetches simultaneously, so may result in improved performance.

    Most machines have a Princeton architecture. The Intel 8051 is a well-known Harvard architecture.

    Figure 2.19: Two memory architectures: (a) Harvard, (b) Princeton

    Memory may be read-only memory (ROM) or readable and writable memory

    (RAM). ROM is usually much more compact than RAM. An embedded system often uses

    ROM for program memory, since, unlike in desktop systems, an embedded systems

    program does not change. Constant-data may be stored in ROM, but other data of

    course requires RAM.

    Memory may be on-chip or off-chip. On-chip memory resides on the same IC as the

    processor, while off-chip memory resides on a separate IC. The processor can usually

    access on-chip memory much faster than off-chip memory, perhaps in just one cycle, but

    finite IC capacity of course implies only a limited amount of on-chip memory.

    Figure 2.20: Cache memory

    To reduce the time needed to access (read or write) memory, a local copy of a portion

    of memory may be kept in a small but especially fast memory called cache. Cache

    memory often resides on-chip, and often uses fast but expensive static RAM technology

    rather than slower but cheaper dynamic RAM. Cache memory is based on the principle

    that if at a particular time a processor accesses a particular memory location, then the

    processor will likely access that location and immediate neighbors of the location in the

    near future.
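    The locality principle above can be observed with a small simulation. This is a minimal sketch of a direct-mapped cache, which is only one of several possible organizations; the line count, block size and address patterns are assumptions chosen for illustration.

```python
def hit_rate(addresses, num_lines=4, block_size=4):
    """Simulate a direct-mapped cache and return the fraction of hits."""
    lines = [None] * num_lines       # block number currently held by each line
    hits = 0
    for addr in addresses:
        block = addr // block_size   # neighbouring addresses share a block
        index = block % num_lines    # the one line this block may occupy
        if lines[index] == block:
            hits += 1
        else:
            lines[index] = block     # miss: load the whole block
    return hits / len(addresses)


sequential = list(range(32))             # walks through neighbouring locations
scattered = [i * 16 for i in range(32)]  # every access lands in a new block
print(hit_rate(sequential))  # 0.75
print(hit_rate(scattered))   # 0.0
```

    The sequential pattern misses only on the first access to each block, while the scattered pattern never reuses a cached block, so the cache gives no benefit there.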

    Operation:

    Instruction execution:

    1. Fetch instruction: the task of reading the next instruction from memory into

    the instruction register.

    2. Decode instruction: the task of determining what operation the instruction

    in the instruction register represents (e.g., add, move, etc.).

    3. Fetch operands: the task of moving the instruction's operand data into appropriate registers.

    4. Execute operation: the task of feeding the appropriate registers through the

    ALU and back into an appropriate register.

    5. Store results: the task of writing a register into memory.

    If each stage takes one clock cycle, then we can see that a single instruction may take

    several cycles to complete.

    Pipelining

    Pipelining is a common way to increase the instruction throughput of a microprocessor.

    We first make a simple analogy of two people approaching the chore of washing and

    drying 8 dishes. In one approach, the first person washes all 8 dishes, and then the

    second person dries all 8 dishes. Assuming 1 minute per dish per person, this approach

    requires 16 minutes. The approach is clearly inefficient since at any time only one

    person is working and the other is idle. Obviously, a better approach is for the second

    person to begin drying the first dish immediately after it has been washed. This

    approach requires only 9 minutes -- 1 minute for the first dish to be washed, and then 8

    more minutes until the last dish is finally dry. We refer to this latter approach as

    pipelined.
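    The 16-minute and 9-minute figures follow from a simple count. The sketch below assumes every stage takes the same time and that nothing ever stalls; the function names are hypothetical.

```python
def non_pipelined_time(items, stages, stage_time=1):
    """Each stage finishes all items before the next stage starts."""
    return items * stages * stage_time


def pipelined_time(items, stages, stage_time=1):
    """The first item passes through every stage; each later item
    finishes one stage time after the previous one."""
    if items == 0:
        return 0
    return (stages + items - 1) * stage_time


print(non_pipelined_time(8, 2))  # 16 minutes: wash all 8, then dry all 8
print(pipelined_time(8, 2))      # 9 minutes: drying overlaps washing
```

    The same formulas apply to instructions: with five stages and 8 instructions, the counts become 40 cycles without overlap and 12 cycles with it.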

    Figure 2.21: Pipelining: (a) non-pipelined dish cleaning, (b) pipelined dish cleaning,

    (c) pipelined instruction execution.

    Each dish is like an instruction, and the two tasks of washing and drying are like the five

    stages listed above. By using a separate unit (each akin to a person) for each stage, we can

    pipeline instruction execution. After the instruction fetch unit fetches the first

    instruction, the decode unit decodes it while the instruction fetch unit simultaneously

    fetches the next instruction.
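    The overlap between stages can be laid out cycle by cycle. The following sketch builds the schedule for an ideal five-stage pipeline; it assumes no stalls or hazards, and the short stage names are labels chosen here for the five stages listed earlier.

```python
STAGES = ["Fetch", "Decode", "Operands", "Execute", "Store"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Return {cycle: {stage: instruction number}} for an ideal pipeline.

    No stalls are modelled (an assumption); real pipelines may stall.
    """
    schedule = {}
    for instr in range(num_instructions):
        for s, stage in enumerate(stages):
            cycle = instr + s + 1  # instruction i reaches stage s in cycle i + s + 1
            schedule.setdefault(cycle, {})[stage] = instr + 1
    return schedule


sched = pipeline_schedule(3)
for cycle in sorted(sched):
    row = ", ".join(f"{stage}: I{n}" for stage, n in sched[cycle].items())
    print(f"cycle {cycle}: {row}")
```

    In cycle 2 of the printed schedule, instruction 1 is in the decode stage while instruction 2 is being fetched, which is the situation the paragraph describes.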

    Superscalar and VLIW Architectures:

    Performance can be improved by:

    Faster clock (but there's a limit)

    Pipelining: slice up instruction into stages, overlap stages

    Multiple ALUs to support more than one instruction stream

    Superscalar

    Scalar: non-vector operations

    Fetches instructions in batches, executes as many as

    possible

    May require extensive hardware to detect

    independent instructions

    VLIW: each word in memory has multiple independent

    instructions

    Relies on the compiler to detect and schedule

    instructions

    Currently growing in popularity

    Programmer's View

    Programmer doesn't need detailed understanding of architecture

    Instead, needs to know what instructions can be executed

    Two levels of instructions:

    Assembly level

    Structured languages (C, C++, Java, etc.)

    Most development today done using structured languages

    But, some assembly level programming may still be necessary

    Drivers: portion of program that communicates with and/or controls

    (drives) another device

    Often have detailed timing considerations, extensive bit

    manipulation

    Assembly level may be best for these

    Fig 2.22 Instruction stored in memory

    Instruction Set:

    Defines the legal set of instructions for that processor

    Data transfer: memory/register, register/register, I/O, etc.

    Arithmetic/logical: move register through ALU and back

    Branches: determine next PC value when not just PC+1

    Addressing Modes:

    Fig 2.23 Addressing modes

    Fig 2.24 A Simple (Trivial) Instruction Set

    Program and data memory space

    The embedded systems programmer must be aware of the size of the available memory

    for program and for data. The programmer must not exceed these limits. In addition,

    the programmer will probably want to be aware of on-chip program and data memory

    capacity, taking care to fit the necessary program and data in on-chip memory if

    possible.

    Registers

    The assembly-language programmer must know how many registers are available for

    general-purpose data storage. For example, a base register may exist, which permits the

    programmer to use a data-transfer instruction where the processor adds an operand

    field to the base register to obtain an actual memory address.
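    The base-register address calculation can be sketched in a few lines. The 16-bit address width, the base value and the small memory contents below are assumptions made for illustration.

```python
def effective_address(base_register, offset, address_bits=16):
    """Add the instruction's operand field to the base register.

    The 16-bit width is an assumption; the sum wraps around as it would
    in a fixed-width address adder.
    """
    return (base_register + offset) % (1 << address_bits)


# Hypothetical array of four values stored from address 0x2000 onward.
memory = {0x2000 + i: value for i, value in enumerate([10, 20, 30, 40])}
base = 0x2000
print(memory[effective_address(base, 2)])  # 30
```

    Keeping the array's start address in the base register lets the same data-transfer instruction reach any element just by changing the operand field.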

    I/O

    The programmer should be aware of the processor's input and output (I/O) facilities,

    with which the processor communicates with other devices. One common I/O facility is

    parallel I/O, in which the programmer can read or write a port (a collection of external

    pins) by reading or writing a special-function register. Another common I/O facility is a

    system bus, consisting of address and data ports that are automatically activated by

    certain addresses or types of instructions.
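    Parallel I/O through a special-function register usually comes down to setting, clearing and testing individual bits. The class below is a hypothetical model of an 8-bit port, not the register map of any particular processor.

```python
class Port:
    """Hypothetical 8-bit parallel port backed by a special-function register."""

    def __init__(self):
        self.register = 0x00

    def set_pin(self, pin):
        self.register |= (1 << pin)          # drive one pin high

    def clear_pin(self, pin):
        self.register &= ~(1 << pin) & 0xFF  # drive one pin low, keep 8 bits

    def read_pin(self, pin):
        return (self.register >> pin) & 1    # sample one pin


p0 = Port()
p0.set_pin(0)
p0.set_pin(3)
p0.clear_pin(0)
print(f"{p0.register:08b}")  # 00001000
```

    Each method touches only the selected bit, so the other pins of the port keep their values.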

    Interrupts

    An interrupt causes the processor to suspend execution of the main program, and instead jump to an Interrupt Service Routine (ISR) that fulfills a special, short-term

    processing need. In particular, the processor stores the current PC, and sets it to the

    address of the ISR. After the ISR completes, the processor resumes execution of the

    main program by restoring the PC. The programmer should be aware of the types of

    interrupts supported by the processor (we describe several types in a subsequent

    chapter), and must write ISRs when necessary. The assembly-language programmer

    places each ISR at a specific address in program memory. The structured-language

    programmer must do so also; some compilers allow a programmer to force a procedure

    to start at a particular memory location, while others recognize pre-defined names for

    particular ISRs.

    For example, we may need to record the occurrence of an event from a peripheral

    device, such as the pressing of a button. We record the event by setting a variable in

    memory when that event occurs, although the user's main program may not process

    that event until later. Rather than requiring the user to insert checks for the event

    throughout the main program, the programmer merely need write an interrupt service

    routine and associate it with an input pin connected to the button. The processor will

    then call the routine automatically when the button is pressed.
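    The button example can be modelled with a toy processor that saves the PC, runs the ISR and then restores the PC. This is a simulation sketch only; the class, the single interrupt source and the `events` dictionary are hypothetical and do not correspond to a real processor's interrupt mechanism.

```python
class Processor:
    """Toy model: a main-program counter plus one interrupt source."""

    def __init__(self, isr):
        self.pc = 0
        self.isr = isr
        self.interrupt_pending = False

    def raise_interrupt(self):
        self.interrupt_pending = True  # e.g. the button's input pin goes active

    def step(self):
        if self.interrupt_pending:
            saved_pc = self.pc         # store the current PC
            self.interrupt_pending = False
            self.isr()                 # run the ISR to completion
            self.pc = saved_pc         # restore the PC and resume
        self.pc += 1                   # execute one main-program instruction


events = {"button_pressed": False}


def button_isr():
    events["button_pressed"] = True    # only record the event; process it later


cpu = Processor(button_isr)
cpu.step()
cpu.raise_interrupt()
cpu.step()
print(cpu.pc, events["button_pressed"])  # 2 True
```

    The main program never checks the button itself; it only finds the recorded event in memory whenever it is ready to process it.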

    Operating System

    Optional software layer providing low-level services to a program (application).

    File management, disk access

    Keyboard/display interfacing

    Scheduling multiple programs for execution

    Or even just multiple threads from one program

    Program makes system calls to the OS

    Development Environment

    Development processor

    The processor on which we write and debug our programs

    Usually a PC

    Target processor

    The processor that the program will run on in our embedded system

    Often different from the development processor

    Software Development Process

    Compilers

    Cross compiler

    Runs on one processor, but generates code for another

    Assemblers

    Linkers

    Debuggers

    Profilers

    Fig 2.25 Software Development Process

    Running a Program:

    If the development processor is different from the target, how can we run our compiled

    code? Two options:

    Download to target processor

    Simulate

    Simulation

    One method: Hardware description language

    But slow, not always available

    Another method: Instruction set simulator (ISS)

    Runs on development processor, but executes instructions of target

    processor

    Testing and Debugging: ISS

    Gives us control over time: set breakpoints, look at register values, set values, step-by-step execution, ...

    But, doesn't interact with real environment

    Download to board

    Use device programmer

    Runs in real environment, but not controllable

    Compromise: emulator

    Runs in real environment, at speed or near

    Supports some controllability from the PC
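    An instruction set simulator can be very small for a toy machine. The sketch below runs on the development machine, interprets a made-up four-instruction set and supports breakpoints; the instruction names, register names and sample program are all assumptions and do not model any real target processor.

```python
def run(program, breakpoints=(), max_steps=1000):
    """Execute a toy program; stop at a breakpoint or at HALT.

    Instructions are tuples: ("LOAD", reg, const), ("ADD", dst, src),
    ("JNZ", reg, target), ("HALT",). Returns (pc, registers).
    """
    regs = {"r0": 0, "r1": 0, "r2": 0}
    pc = 0
    for _ in range(max_steps):
        if pc in breakpoints:
            return pc, regs            # hand control back to the user
        op = program[pc]
        if op[0] == "HALT":
            return pc, regs
        if op[0] == "LOAD":
            regs[op[1]] = op[2]
        elif op[0] == "ADD":
            regs[op[1]] += regs[op[2]]
        elif op[0] == "JNZ" and regs[op[1]] != 0:
            pc = op[2]                 # branch taken: next PC is not PC+1
            continue
        pc += 1
    raise RuntimeError("step limit reached")


# Sum 3 + 2 + 1 into r0 by counting r1 down to zero.
program = [
    ("LOAD", "r0", 0),
    ("LOAD", "r1", 3),
    ("LOAD", "r2", -1),
    ("ADD", "r0", "r1"),
    ("ADD", "r1", "r2"),
    ("JNZ", "r1", 3),
    ("HALT",),
]
print(run(program))                   # (6, {'r0': 6, 'r1': 0, 'r2': -1})
print(run(program, breakpoints={4}))  # (4, {'r0': 3, 'r1': 3, 'r2': -1})
```

    The second call stops before the instruction at address 4 executes, so the register values can be inspected mid-run, which is the kind of control a real board does not offer.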

    Fig 2.26 software design process

    Application-Specific Instruction-Set Processors (ASIPs):

    General-purpose processors

    Sometimes too general to be effective in demanding application

    e.g., video processing requires huge video buffers and operations on large arrays of data, inefficient on a GPP

    But single-purpose processor has high NRE, not programmable

    ASIPs targeted to a particular domain

    Contain architectural features specific to that domain

    e.g., embedded control, digital signal processing, video

    processing, network processing, telecommunications, etc.

    Still programmable

    A Common ASIP: Microcontroller

    For embedded control applications

    Reading sensors, setting actuators

    Mostly dealing with events (bits): data is present, but not in huge

    amounts

    e.g., VCR, disk drive, digital camera (assuming SPP for image

    compression), washing machine, microwave oven

    Microcontroller features

    On-chip peripherals

    Timers, analog-digital converters, serial communication, etc.

    Tightly integrated for programmer, typically part of register

    space

    On-chip program and data memory

    Direct programmer access to many of the chip's pins

    Specialized instructions for bit-manipulation and other low-level

    operations

    Digital Signal Processors (DSP)

    For signal processing applications

    Large amounts of digitized data, often streaming

    Data transformations must be applied fast

    e.g., cell-phone voice filter, digital TV, music synthesizer

    DSP features

    Several instruction execution units

    Multiply-accumulate single-cycle instruction, other instrs.

    Efficient vector operations, e.g., add two arrays

    Vector ALUs, loop buffers, etc.
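    The multiply-accumulate operation is the inner step of most filters. As a rough illustration, the sketch below applies a finite impulse response (FIR) filter in plain Python; the filter choice and the two-tap coefficients are assumptions, and a DSP would perform each marked line as one single-cycle instruction.

```python
def fir_filter(samples, coefficients):
    """Apply a FIR filter using repeated multiply-accumulate steps.

    Each output is the sum of products of the most recent inputs and the
    coefficients; inputs before the first sample are treated as zero.
    """
    outputs = []
    for n in range(len(samples)):
        acc = 0
        for k, coeff in enumerate(coefficients):
            if n - k >= 0:
                acc += coeff * samples[n - k]  # one multiply-accumulate
        outputs.append(acc)
    return outputs


# Two-tap moving sum as a stand-in for a real filter.
print(fir_filter([1, 2, 3, 4], [1, 1]))  # [1, 3, 5, 7]
```

    For streaming data this inner loop runs once per sample, so doing each multiply-accumulate in a single cycle directly bounds how fast the transformation can be applied.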

    Selecting a Microprocessor

    Issues

    Technical: speed, power, size, cost

    Other: development environment, prior expertise, licensing, etc.

    Speed: how do we evaluate a processor's speed?

    Clock speed, but instructions per cycle may differ

    Instructions per second, but work per instr. may differ

    Dhrystone: Synthetic benchmark, developed in 1984.

    Dhrystones/sec.

    MIPS: 1 MIPS = 1757 Dhrystones per second (based on Digital's

    VAX 11/780). A.k.a. Dhrystone MIPS. Commonly used today.

    So, 750 MIPS = 750*1757 = 1,317,750 Dhrystones per

    second

    SPEC: set of more realistic benchmarks, but oriented to desktops

    EEMBC: EDN Embedded Benchmark Consortium. Suites of benchmarks: automotive, consumer electronics,

    networking, office automation, telecommunications
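    The Dhrystone MIPS conversion quoted above is a single multiplication or division by the VAX 11/780 baseline. The helper names in this small sketch are hypothetical.

```python
DHRYSTONES_PER_SEC_PER_MIPS = 1757  # VAX 11/780 baseline quoted in the notes


def mips_to_dhrystones(mips):
    """Dhrystones per second for a given Dhrystone MIPS rating."""
    return mips * DHRYSTONES_PER_SEC_PER_MIPS


def dhrystones_to_mips(dhrystones_per_sec):
    """Dhrystone MIPS rating for a measured Dhrystones-per-second score."""
    return dhrystones_per_sec / DHRYSTONES_PER_SEC_PER_MIPS


print(mips_to_dhrystones(750))        # 1317750
print(dhrystones_to_mips(1_317_750))  # 750.0
```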

    Designing a General Purpose Processor

    Not something an embedded system designer normally would do

    But instructive to see how simply we can build one top down

    Remember that real processors aren't usually built this way

    Much more optimized, much more bottom-up design

    Fig:2.27 A simple microprocessor

    RECOMMENDED QUESTIONS

    UNIT 2

    ( Hardware)

    1. What is a single purpose processor? What are the benefits of choosing a

    single purpose processor over a general purpose processor?

    2. How do nMOS and pMOS transistors differ?

    3. Build a 3-input NAND gate using a minimum number of CMOS transistors.

    4. Build a 3-input NOR gate using a minimum number of CMOS transistors.

    5. Build a 2-input AND gate using a minimum number of CMOS transistors.

    6. Build a 2-input OR gate using a minimum number of CMOS transistors.

    7. Explain why NAND and NOR gates are more common than AND and OR

    gates.

    8. Distinguish between combinational and sequential circuit.

    9. Design a 2-bit comparator with a single output "less than" using

    combinational design technique.

    10.Design a 3 X 8 decoder with truth table and K-maps.

    11.What is the difference between synchronous and asynchronous circuit?

    12.What is the purpose of datapath and control path?

    13.Design a single purpose processor that outputs Fibonacci numbers up to n places. Start with a function computing the desired result, translate it into

    state diagram and sketch a probable datapath.

    UNIT 2

    ( Software)

    1. Describe why a general purpose processor could cost less than a single

    purpose processor.

    2. Create a table listing the address spaces for 8, 16, 24, 32, and 64-bit address

    sizes.

    3. Illustrate how program and data memory fetches can be overlapped in a

    Harvard architecture.

    4. For a microcontroller, create a table listing five existing variations, stressing

    the features that differ from the basic version.

    QUESTION PAPER SOLUTION

    UNIT 2

    Q1. Write an algorithm for GCD with more time complexity and write the FSMD. Also determine the total number of steps required for GCD.

    First create algorithm

    Convert algorithm to complex state machine

    Known as FSMD: finite-state machine with datapath

    Can use templates to perform such conversion

    GCD

    Create a register for any declared variable

    Create a functional unit for each arithmetic operation

    Connect the ports, registers and functional units

    Based on reads and writes

    Use multiplexors for multiple sources

    Create unique identifier

    for each datapath component control input and output

    Templates for creating state diagram :

    We finished the datapath

    We have a state table for the next state and control logic

    All that's left is combinational logic design

    This is not an optimized design, but we see the basic steps
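    To make the steps concrete, the subtract-based GCD algorithm can be run as a state machine that counts one step per state visited. This is a sketch under assumptions: the state names are invented here and need not match the state diagram in the notes, and the step count depends on how the states are split.

```python
def gcd_fsmd(x, y):
    """Run the subtract-based GCD as a state machine; return (gcd, steps).

    State names are invented for this sketch. Inputs must be positive.
    """
    if x <= 0 or y <= 0:
        raise ValueError("inputs must be positive")
    state, steps = "TEST_NEQ", 0
    while state != "DONE":
        steps += 1                   # one state visited per step
        if state == "TEST_NEQ":      # while (x != y)
            state = "TEST_LT" if x != y else "OUTPUT"
        elif state == "TEST_LT":     # if (x < y)
            state = "SUB_Y" if x < y else "SUB_X"
        elif state == "SUB_Y":       # y = y - x
            y -= x
            state = "TEST_NEQ"
        elif state == "SUB_X":       # x = x - y
            x -= y
            state = "TEST_NEQ"
        elif state == "OUTPUT":      # output the result
            state = "DONE"
    return x, steps


print(gcd_fsmd(42, 8))  # (2, 26)
```

    With this state split, every subtraction costs three steps (two tests and the subtraction itself), so inputs 42 and 8 need 8 subtractions and 26 steps in total.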

    Templates for creating state diagram

    Q2. Explain the different methods to optimize the FSMD.

    Optimization is the task of making design metric values the best

    possible

    Optimization opportunities

    original program

    FSMD

    datapath

    FSM

    Optimizing the original program

    Analyze program attributes and look for areas of possible

    improvement

    number of computations

    size of variable

    time and space complexity

    operations used

    multiplication and division very expensive
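    As an illustration of optimizing the original program, the sketch below compares two GCD formulations by loop iterations. It is one possible rewrite, not necessarily the one used in the notes; note that the remainder version cuts the number of computations but relies on a division-like operation, which is exactly the kind of expensive hardware the list above warns about.

```python
def gcd_subtract(x, y):
    """Repeated subtraction; returns (gcd, loop iterations).

    Assumes positive inputs (the loop would not end if one input were zero).
    """
    iterations = 0
    while x != y:
        if x < y:
            y -= x
        else:
            x -= y
        iterations += 1
    return x, iterations


def gcd_modulo(x, y):
    """Euclid's algorithm with remainder; returns (gcd, loop iterations)."""
    iterations = 0
    while y != 0:
        x, y = y, x % y
        iterations += 1
    return x, iterations


print(gcd_subtract(42, 8))  # (2, 8)
print(gcd_modulo(42, 8))    # (2, 2)
```

    Which version is better depends on the design metric: fewer iterations favour the remainder form, while a smaller datapath favours plain subtractors.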

    Q3. Explain the different memory architectures.

    Program information consists of the sequence of instructions that cause the processor

    to carry out the desired system functionality. Data information represents the values

    being input, output and transformed by the program. We can store program and data

    together or separately.

    In a Princeton architecture, data and program words share the same memory space. The

    Princeton architecture may result in a simpler hardware connection to memory, since

    only one connection is necessary.

    In a Harvard architecture, the program memory space is distinct from the data memory

    space. A Harvard architecture, while requiring two connections, can perform instruction

    and data fetches simultaneously, so may result in improved performance.

    Most machines have a Princeton architecture. The Intel 8051 is a well-known Harvard

    architecture.

    Figure 2.19: Two memory architectures: (a) Harvard, (b) Princeton

    Memory may be read-only memory (ROM) or readable and writable memory

    (RAM). ROM is usually much more compact than RAM. An embedded system often uses

    ROM for program memory, since, unlike in desktop systems, an embedded system's

    program does not change. Constant data may be stored in ROM, but other data of

    course requires RAM.

    Memory may be on-chip or off-chip. On-chip memory resides on the same IC as the

    processor, while off-chip memory resides on a separate IC. The processor can usually

    access on-chip memory much faster than off-chip memory, perhaps in just one cycle, but

    finite IC capacity of course implies only a limited amount of on-chip memory.

    Q4. Explain pipelining for instruction execution with dish cleaning.

    Pipelining is a common way to increase the instruction throughput of a microprocessor.

    We first make a simple analogy of two people approaching the chore of washing and

    drying 8 dishes. In one approach, the first person washes all 8 dishes, and then the

    second person dries all 8 dishes. Assuming 1 minute per dish per person, this approach

    requires 16 minutes. The approach is clearly inefficient since at any time only one

    person is working and the other is idle. Obviously, a better approach is for the second

    person to begin drying the first dish immediately after it has been washed. This

    approach requires only 9 minutes -- 1 minute for the first dish to be washed, and then 8

    more minutes until the last dish is finally dry.

    Figure 2.21: Pipelining: (a) non-pipelined dish cleaning, (b) pipelined dish cleaning,

    (c) pipelined instruction execution.

    Each dish is like an instruction, and the two tasks of washing and drying are like the five

    stages listed above. By using a separate unit (each akin to a person) for each stage, we can

    pipeline instruction execution. After the instruction fetch unit fetches the first

    instruction, the decode unit decodes it while the instruction fetch unit simultaneously

    fetches the next instruction.

    Q5. Explain the software development process.

    Software Development Process

    Compilers

    Cross compiler

    Runs on one processor, but generates code for another

    Assemblers

    Linkers

    Debuggers

    Profilers

    Fig 2.25 Software Development Process

    Running a Program:

    If the development processor is different from the target, how can we run our compiled

    code? Two options:

    Download to target processor

    Simulate

    Simulation

    One method: Hardware description language

    But slow, not always available

    Another method: Instruction set simulator (ISS)

    Runs on development processor, but executes instructions of target

    processor

    Testing and Debugging: ISS

    Gives us control over time: set breakpoints, look at register values, set

    values, step-by-step execution, ...

    But, doesn't interact with real environment

    Download to board

    Use device programmer

    Runs in real environment, but not controllable

    Compromise: emulator

    Runs in real environment, at speed or near

    Supports some controllability from the PC

    Fig 2.26 software design process

    optimizing the program

    Optimizing the FSMD:

    Areas of possible improvements

    merge states

    states with constants on transitions can be eliminated,

    transition taken is already known

    states with independent operations can be merged

    separate states

    states which require complex operations (a*b*c*d) can be

    broken into smaller states to reduce hardware size

    scheduling

    optimizing the FSMD for GCD

    Optimizing the datapath:

    Sharing of functional units

    one-to-one mapping, as done previously, is not necessary

    if same operation occurs in different states, they can share a single

    functional unit

    Multi-functional units

    ALUs support a variety of operations, so one ALU can be shared among operations occurring in different states

    Optimizing the FSM:

    State encoding
