Ece Viii Embedded System Design [06ec82] Notes

8/22/2019 Ece Viii Embedded System Design [06ec82] Notes


Embedded System Design 06EC82

ECE, SJBIT

Sub: Embedded System Design Sub code: 06EC82

Sem: VIII

PART A

UNIT - 1

INTRODUCTION: Overview of embedded systems, embedded system design challenges, common design metrics and optimizing them. Survey of different embedded system design technologies, trade-offs. Custom Single-Purpose Processors, Design of custom single purpose processors. 4 Hours

UNIT - 2

SINGLE-PURPOSE PROCESSORS: Hardware, Combinational Logic, Sequential Logic, RT level Combinational and Sequential Components, Optimizing single-purpose processors. General-Purpose Processors: Software, Basic Architecture, Operation, Programmer's View, Development Environment, ASIPs. 6 Hours

UNIT - 3

Standard Single-Purpose Peripherals, Timers, Counters, UART, PWM, LCD Controllers, Keypad Controllers, Stepper Motor Controller, A to D Converters, Examples. 6 Hours

UNIT - 4

MEMORY: Introduction, Common Memory Types, Composing Memory, Memory Hierarchy and Cache, Advanced RAM. Interfacing, Communication Basics, Microprocessor Interfacing, Arbitration, Advanced Communication Principles, Protocols - Serial, Parallel. 8 Hours

PART - B

UNIT - 5

INTERRUPTS: Basics - Shared Data Problem - Interrupt latency. Survey of Software Architecture, Round Robin, Round Robin with Interrupts - Function Queues - Scheduling - RTOS architecture. 8 Hours

UNIT - 6

INTRODUCTION TO RTOS, MORE OS SERVICES: Tasks - States - Data - Semaphores and shared data. More operating system services - Message Queues - Mail Boxes - Timers - Events - Memory Management. 8 Hours

UNIT - 7 & 8

Basic Design Using RTOS: Principles - An example, Encapsulating semaphores and Queues. Hard real-time scheduling considerations - Saving memory space and power. Hardware software co-design aspects in embedded systems. 12 Hours



INDEX SHEET

SL.NO TOPIC

UNIT - 1

INTRODUCTION: Overview of embedded systems

01 Embedded systems overview
02 Design challenges, common design metrics
03 Processor technology
04 IC technology
05 Design technology
06 Tradeoffs
07 Recommended questions and solutions

UNIT - 2

CUSTOM SINGLE-PURPOSE PROCESSORS

HARDWARE:

01 Introduction, combinational logic
02 Sequential logic
03 Custom single purpose processor design
04 RT level processor design
05 Optimizing custom processors

SOFTWARE:

06 Basic architecture
07 Operation
08 Programmer's view
09 Development environment
10 ASIPs

UNIT - 3

Standard Single Purpose Processors: Peripherals

01 Introduction, timers, counters, watchdog timers
02 UART, PWM
03 LCD controllers, stepper motor controllers
04 Analog to digital converters, RTC

UNIT - 4

Memory and Microprocessor interfacing

01 Introduction, memory write ability
02 Common memory types
03 Composing memory
04 Memory hierarchy and cache
05 Advanced RAM
06 Communication basics
07 Microprocessor interfacing
08 Arbitration
09 Multilevel bus architectures
10 Advanced communication principles
11 Recommended questions and solutions

UNIT - 5

INTERRUPTS and Survey of software architecture

01 Shared data problem
02 Round robin
03 Function queues
04 RTOS architecture
05 Recommended questions and solutions

UNIT - 6

INTRODUCTION TO RTOS, MORE ON OS SERVICES

01 Tasks, states, data
02 Semaphores
03 Message queues, mail boxes
04 Events, memory management

UNIT - 7 & 8

BASIC DESIGN USING RTOS

01 Principles
02 Encapsulating semaphores
03 Hard real time scheduling considerations
04 Saving memory and power




PART A

UNIT - 1

INTRODUCTION: Overview of embedded systems, embedded system design challenges, common design metrics and optimizing them. Survey of different embedded system design technologies, trade-offs. Custom Single-Purpose Processors, Design of custom single purpose processors.

4 Hours

TEXT BOOKS:

1. Embedded System Design: A Unified Hardware/Software Introduction - Frank Vahid, Tony Givargis, John Wiley & Sons, Inc., 2002

2. An Embedded software Primer - David E. Simon: Pearson Education, 1999

REFERENCE BOOKS:

1. Embedded Systems: Architecture and Programming, Raj Kamal, TMH. 2008

2. Embedded Systems Architecture: A Comprehensive Guide for Engineers and Programmers, Tammy Noergaard, Elsevier Publication, 2005

3. Embedded C programming, Barnett, Cox & Ocull, Thomson (2005).



EMBEDDED SYSTEM DESIGN

UNIT 1

INTRODUCTION

1.1 Embedded systems overview

1.2 Design Challenges

1.3 Processor Technology

1.4 IC Technology

1.5 Design Technology

1.1. Embedded systems overview

An embedded system is nearly any computing system other than a desktop computer. An embedded system is a dedicated system which performs the desired function upon power up, repeatedly.

Embedded systems are found in a variety of common electronic devices, such as: (a) consumer electronics -- cell phones, pagers, digital cameras, camcorders, video cassette recorders, VCD players, portable video games, calculators, and personal digital assistants; (b) home appliances -- microwave ovens, answering machines, thermostats, home security, washing machines, and lighting systems; (c) office automation -- fax machines, copiers, printers, and scanners; (d) business equipment -- cash registers, curbside check-in, alarm systems, card readers, product scanners, and automated teller machines; (e) automobiles -- transmission control, cruise control, fuel injection, anti-lock brakes, and active suspension.

Common characteristics of embedded systems:

Embedded systems have several common characteristics that distinguish them from other computing systems:

1. Single-functioned: An embedded system executes a single program repeatedly. The entire program is executed in a loop over and over again.

2. Tightly constrained: It should cost little, perform fast enough to process data in real time, fit on a single chip, consume as little power as possible, etc.

3. Reactive and real time: Embedded systems should continuously react to changes in the environment. They should also process and compute data in real time without delay.



Fig 1.1 An embedded system example: a digital camera

1.2 Design challenges

Design metrics:

A design metric is a measure of an implementation's features, such as cost, size, performance and power. An embedded system

- must cost little

- must be sized to fit on a single chip

- must perform in real time (response time)

- must consume minimum power

The embedded system designer must construct an implementation that fulfils the desired functionality. Apart from meeting the functionality, the designer should also consider optimizing numerous design metrics.

Common design metrics that a design engineer should consider:

- NRE (non-recurring engineering) cost: The one-time monetary cost of designing the system.

- Unit cost: The monetary cost of manufacturing each copy of the system, excluding NRE cost.



- Size: The physical space required by the system, often measured in bytes for software and in number of gates for hardware.

- Performance: The execution/response time of the system.

- Power: The amount of power consumed by the system, which may determine the lifetime of the battery and the cooling requirements of the IC. More power means more heat.

- Flexibility: The ability to change the functionality of the system.

- Time to prototype: The time needed to build a working system without incurring heavy NRE cost.

- Time to market: The time required to develop the system to the point where it can be released to the market.

- Maintainability: The ability to modify the system after its release to the market.

- Correctness: Our confidence that we have implemented the system's functionality correctly.

- Safety: The probability that the system does not cause any harm.

Metrics typically compete with one another: improving one often leads to worsening of another.

Fig : 1.2 Design metric competition

1.2.1 Time to Market Design Metric:

- Time to market: Introducing an embedded system to the market early can make a big difference to the system's profitability. Market windows are generally very narrow, often on the order of a few months. Missing this window can mean a significant loss in sales.

Fig 1.3 Time to market: (A) market window, (B) simplified revenue model for computing revenue loss

Let's investigate the loss of revenue that can occur due to delayed entry of a product into the market. We can use a simple triangle model in which the y-axis represents the market rise and the x-axis represents time, including the point of entry into the market. The revenue for an on-time market entry is the area of the triangle labeled "on time", and the revenue for a delayed-entry product is the area of the triangle labeled "Delayed". The revenue loss for a delayed entry is the difference between these triangles' areas.

$$\%\ \text{revenue loss} = \frac{\text{on time} - \text{Delayed}}{\text{on time}} \times 100\,\%$$

The area of a triangle = 1/2 * base * height, where

W -- height of the on-time triangle (the market rise)

D -- delay of entry (in weeks or months)

2W -- product's lifetime

$$\text{Area of on-time triangle} = \tfrac{1}{2} \cdot 2W \cdot W = W^2$$

$$\text{Area of delayed triangle} = \tfrac{1}{2} \cdot (W - D + W) \cdot (W - D)$$

$$\%\ \text{revenue loss} = \frac{D\,(3W - D)}{2W^2} \times 100\,\%$$

Ex: The product's lifetime is 52 weeks, so W = 26.

The delay of entry to the market is D = 4 weeks.

Percentage revenue loss = 4 * (3 * 26 - 4) / (2 * 26 * 26) * 100 % = 296 / 1352 * 100 %, which is approximately 22 %.
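The triangle model above can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `percent_revenue_loss` and its argument names are chosen here for illustration. It computes the two triangle areas directly rather than using the closed-form expression, so it also serves as a check on the derivation.

```python
def percent_revenue_loss(w: float, d: float) -> float:
    """Percentage revenue loss under the simplified triangle model.

    w: half of the product lifetime (the market peaks at time w)
    d: delay in market entry, in the same time unit as w
    """
    if w <= 0 or not 0 <= d <= w:
        raise ValueError("expected w > 0 and 0 <= d <= w")
    on_time = 0.5 * (2 * w) * w            # area of the on-time triangle
    delayed = 0.5 * (2 * w - d) * (w - d)  # area of the delayed triangle
    return (on_time - delayed) / on_time * 100


# Example from the notes: 52-week lifetime (w = 26), 4-week delay.
print(round(percent_revenue_loss(26, 4)))  # prints 22
```

With the same 52-week lifetime, a 10-week delay gives a loss of roughly 50 %, which shows how quickly the penalty grows with the delay.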



1.2.2 The NRE and Unit cost Design metrics:

Unlike other design metrics, the best technology choice for these two metrics depends on the number of units to be produced. For example, suppose three technologies are available:

Technology    NRE cost    Unit cost
A             $2,000      $100
B             $30,000     $30
C             $100,000    $2

Total cost = NRE cost + unit cost * no. of units

Per-product cost = total cost / no. of units = NRE cost / no. of units + unit cost
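To see how the best choice shifts with volume, the two formulas can be evaluated for the three technologies in the table. This is a small illustrative Python sketch; the names `TECHNOLOGIES`, `total_cost`, `per_product_cost` and `cheapest_technology` are introduced here and do not come from the notes, and the volumes tried are arbitrary.

```python
# (NRE cost, unit cost) in dollars for the three technologies in the table.
TECHNOLOGIES = {"A": (2_000, 100), "B": (30_000, 30), "C": (100_000, 2)}


def total_cost(nre: float, unit_cost: float, units: int) -> float:
    """Total cost = NRE cost + unit cost * number of units."""
    return nre + unit_cost * units


def per_product_cost(nre: float, unit_cost: float, units: int) -> float:
    """Per-product cost = total cost / number of units."""
    return total_cost(nre, unit_cost, units) / units


def cheapest_technology(units: int) -> str:
    """Name of the technology with the lowest total cost at this volume."""
    return min(TECHNOLOGIES, key=lambda name: total_cost(*TECHNOLOGIES[name], units))


for units in (100, 1_000, 10_000):
    print(units, cheapest_technology(units))
# prints:
# 100 A
# 1000 B
# 10000 C
```

Setting the total costs equal shows where the choice changes: A and B break even at 400 units, and B and C at 2,500 units.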

1.2.3 The performance Design metric:

Performance of a system is a measure of how long the system takes to execute our desired tasks. There are several measures of performance. The two main measures are:

- Latency or response time: the time between the start of a task's execution and its end.

- Throughput: the number of tasks that can be processed per unit time.

Speedup is a method of comparing the performance of two systems:

Speedup of A over B = performance of A / performance of B.
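As a rough illustration of the speedup ratio, the Python sketch below takes throughput as the performance measure. That choice, the function names, and the task counts are assumptions made for this example; the notes do not say which measure to use.

```python
def throughput(tasks_completed: int, elapsed_seconds: float) -> float:
    """Tasks processed per second."""
    return tasks_completed / elapsed_seconds


def speedup(performance_a: float, performance_b: float) -> float:
    """Speedup of A over B = performance of A / performance of B."""
    return performance_a / performance_b


# Hypothetical measurements: in 60 s, system A finishes 120 tasks and B finishes 30.
perf_a = throughput(120, 60.0)  # 2.0 tasks per second
perf_b = throughput(30, 60.0)   # 0.5 tasks per second
print(speedup(perf_a, perf_b))  # prints 4.0
```

If latency is used instead, performance is usually taken as its reciprocal, so the speedup of A over B becomes the latency of B divided by the latency of A.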

Technologies used in embedded systems:

Technology is a manner of accomplishing a task. Three types of technologies are central to embedded system design:

- Processor technologies

- IC technologies

- Design technologies

Processor technology: relates to the architecture of the computation engine used to implement a system's desired functionality. Generally the term processor is associated with programmable software processors, but many non-programmable digital systems can also be thought of as processors.

Single purpose processor: a digital system designed to execute exactly one function. Performance may be good, but flexibility is poor.



Application specific processor: may serve as a compromise between single purpose and general purpose processors. An ASIP is a programmable processor optimized for a particular class of applications having common characteristics, such as embedded control, digital signal processing, or telecommunications. This provides flexibility while still achieving good performance, power and size.

General purpose processor: The designer of a general purpose processor, or microprocessor, builds a programmable device that is suitable for a variety of applications, to maximize the number of devices sold.

Design considerations:

- Should accommodate different kinds of programs

- Should provide a general data path to handle a variety of computations

Design technology: involves converting our concept of the desired functionality into an implementation. The design process should optimize the design metrics and should also be completed quickly.

Variations of the top-down design process have become popular.

1.3.1 Processor Technologies:

1. General Purpose Processors - Software

2. Single Purpose Processors - Hardware

3. Application Specific Processors: Application specific instruction set processors (ASIP)

1. General Purpose Processors - Software

They are programmable devices used in a variety of applications. They are also known as microprocessors. They have a program memory and a general data path with a large register file and a general ALU. The data path must be large enough to handle a variety of computations. The programmer writes the program that carries out the required functionality into the program memory and uses the features (instructions) provided by the general data path. This is called the software portion of the system. Such a processor has considerable benefits: time-to-market and NRE costs are low, and flexibility is high.

Design time and NRE cost are low, because the designer must only write a program, but need not do any digital design. Flexibility is high, because changing functionality requires only changing the program. Unit cost may be relatively low in small quantities, since the processor manufacturer sells large quantities to other customers and hence distributes the NRE cost over many units. Performance may be fast for computation-intensive applications, if using a fast processor, due to advanced architecture features and leading edge IC technology.

There are also some design-metric drawbacks: Unit cost may be too high for large quantities. Performance may be slow for certain applications. Size and power may be large due to unnecessary processor hardware.

Figure 1.4(d) illustrates the use of a single-purpose processor in our embedded system example, representing an exact fit of the desired functionality, nothing more, nothing less.

Fig 1.4: Processors vary in their customization for the problem at hand: (a) desired functionality, (b) general-purpose processor, (c) application-specific processor, (d) single-purpose processor.



Fig 1.5 Implementing desired functionality on a general-purpose processor

2. Single Purpose Processors - Hardware:

This is a digital circuit designed to execute exactly one program. It contains only the components needed to execute that single program, and it contains no program memory. The user cannot change the functionality of the chip. Such processors are fast, low-power and small.

An embedded system designer creates a single-purpose processor by designing a custom digital circuit. Using a single-purpose processor in an embedded system results in several design metric benefits and drawbacks, which are essentially the inverse of those for general purpose processors. Performance may be fast, size and power may be small, and unit cost may be low for large quantities, while design time and NRE costs may be high, flexibility is low, unit cost may be high for small quantities, and performance may not match general-purpose processors for some applications.



Fig 1.6 Implementing desired functionality on a single-purpose processor

3. Application Specific Processors: Application specific instruction set processors (ASIP):

They are programmable processors optimized for a particular class of applications having common characteristics. They strike a compromise between general-purpose and single-purpose processors. They have a program memory, an optimized data path and special functional units. They offer some flexibility together with good performance, size and power.

An application-specific instruction-set processor (or ASIP) can serve as a compromise between the above processor options. An ASIP is designed for a particular class of applications with common characteristics, such as digital-signal processing, telecommunications, embedded control, etc. The designer of such a processor can optimize the datapath for the application class, perhaps adding special functional units for common operations, and eliminating other infrequently used units.



Fig 1.7 Implementing desired functionality on an application-specific processor

Digital-signal processors (DSPs) are a common class of ASIP, so they deserve special mention. A DSP is a processor designed to perform common operations on digital signals, which are the digital encodings of analog signals like video and audio. These operations carry out common signal processing tasks like signal filtering, transformation, or combination. Such operations are usually math-intensive, including operations like multiply and add or shift and add. To support such operations, a DSP may have special purpose datapath components such as a multiply-accumulate unit, which can perform a computation like T = T + M[i]*k using only one instruction. Because DSP programs often manipulate large arrays of data, a DSP may also include special hardware to fetch sequential data memory locations in parallel with other operations, to further speed execution.
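The multiply-accumulate computation T = T + M[i]*k mentioned above can be written out in software as a simple loop. The Python sketch below is only an illustration of the arithmetic; the function name `multiply_accumulate` and the sample data are assumptions made here. On a DSP with a multiply-accumulate unit, each loop iteration would correspond to a single instruction.

```python
def multiply_accumulate(m, k):
    """Accumulate T = T + M[i] * k over every element of the array M."""
    t = 0
    for i in range(len(m)):
        t = t + m[i] * k  # one multiply-accumulate step
    return t


# Hypothetical sample data: three values scaled by a coefficient of 2.
print(multiply_accumulate([1, 2, 3], 2))  # prints 12
```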

Highlight merits and demerits of single purpose processors and general-purpose processors.

Single Purpose Processors:

Merits:

1. They are fast

2. They consume low power

3. They have small size

4. Unit cost may be low for large quantities



Demerits:

1. NRE costs may be high

2. Low flexibility

3. Unit cost high for small quantities

4. Performance may not match general-purpose processors for some applications

General Purpose Processors:

Merits:

1. High Flexibility

2. Low NRE costs

3. Low time to market

4. Performance may be fast for computation-intensive applications.

De-Merits:

1. Unit cost may be relatively high for large quantities.

2. Performance may be slower for certain applications.

3. Size and power may be large due to unnecessary processor hardware.

How is a single purpose processor distinctly different from a general-purpose processor?

Sl. No. | Single Purpose Processor | General Purpose Processor
1. | Executes exactly one program. | Executes any program written by the user.
2. | The functionality cannot be changed. | The functionality can be changed by the user by writing the required program.
3. | Does not have program memory. | Has program memory.
4. | Has no flexibility and contains only the resources required for its particular functionality. | Has a very large amount of resources, which may or may not be used for a particular functionality, as decided by the user.
5. | Merits: fast, low power consumption, small size, and unit cost may be low for large quantities. | Merits: high flexibility, low NRE costs, low time to market, and performance may be fast for computation-intensive applications.

1.4 IC Technology

Every processor must eventually be implemented on an IC. IC technology involves the manner in which we map a digital (gate-level) implementation onto an IC. An IC (Integrated Circuit), often called a chip, is a semiconductor device consisting of a set of connected transistors and other devices. A number of different processes exist to build semiconductors, the most popular of which is CMOS (Complementary Metal Oxide Semiconductor). The IC technologies differ by how customized the IC is for a particular implementation. IC technology is independent of processor technology; any type of processor can be mapped to any type of IC technology.

Fig 1.8: The independence of processor and IC technologies: any processor technology can be mapped to any IC technology.

To understand the differences among IC technologies, we must first recognize that semiconductors consist of numerous layers. The bottom layers form the transistors. The middle layers form logic gates. The top layers connect these gates with wires. One way to create these layers is by depositing photo-sensitive chemicals on the chip surface and then shining light through masks to change regions of the chemicals. Thus, the task of building the layers is actually one of designing appropriate masks. A set of masks is often called a layout. The narrowest line that we can create on a chip is called the feature size, which today is well below one micrometer (sub-micron).

1.4.1 Full-custom/VLSI

In a full-custom IC technology, we optimize all layers for our particular embedded system's digital implementation. Such optimization includes placing the transistors to minimize interconnection lengths, sizing the transistors to optimize signal transmissions and routing wires among the transistors. Once we complete all the masks, we send the mask specifications to a fabrication plant that builds the actual ICs. Full-custom IC design, often referred to as VLSI (Very Large Scale Integration) design, has very high NRE cost and long turnaround times (typically months) before the IC becomes available, but can yield excellent performance with small size and power. It is usually used only in high-volume or extremely performance-critical applications.

1.4.2 Semi-custom ASIC (gate array and standard cell)

In an ASIC (Application-Specific IC) technology, the lower layers are fully or partially built, leaving us to finish the upper layers. In a gate array technology, the masks for the transistor and gate levels are already built (i.e., the IC already consists of arrays of gates). The remaining task is to connect these gates to achieve our particular implementation. In a standard cell technology, logic-level cells (such as an AND gate or an AND-OR-INVERT combination) have their mask portions pre-designed, usually by hand. Thus, the remaining task is to arrange these portions into complete masks for the gate level, and then to connect the cells. ASICs are by far the most popular IC technology, as they provide for good performance and size, with much less NRE cost than full-custom ICs.

1.4.3 PLD

In a PLD (Programmable Logic Device) technology, all layers already exist, so we can purchase the actual IC. The layers implement a programmable circuit, where programming has a lower-level meaning than a software program. The programming that takes place may consist of creating or destroying connections between wires that connect gates, either by blowing a fuse, or setting a bit in a programmable switch. Small devices, called programmers, connected to a desktop computer can typically perform such programming.

We can divide PLDs into two types, simple and complex. One type of simple PLD is a PLA (Programmable Logic Array), which consists of a programmable array of AND gates and a programmable array of OR gates. Another type is a PAL (Programmable Array Logic), which uses just one programmable array to reduce the number of expensive programmable components. One type of complex PLD, growing very rapidly in popularity over the past decade, is the FPGA (Field Programmable Gate Array), which offers more general connectivity among blocks of logic, rather than just arrays of logic as with PLAs and PALs, and is thus able to implement far more complex designs.

PLDs offer very low NRE cost and almost instant IC availability. However, they are typically bigger than ASICs, may have higher unit cost, may consume more power, and may be slower (especially FPGAs). They still provide reasonable performance, though, so they are especially well suited to rapid prototyping.
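The PLA structure described above, a programmable AND array feeding a programmable OR array, can be modelled in a few lines. The Python sketch below is a behavioural illustration only, not a description of any real device; the function `pla_eval`, the encoding of the two planes, and the half-adder configuration are all assumptions made for this example.

```python
from itertools import product


def pla_eval(inputs, and_plane, or_plane):
    """Evaluate a PLA-style sum-of-products circuit.

    and_plane: one tuple per product term; each entry is 1 (use the input),
               0 (use its complement) or None (input not connected).
    or_plane:  one tuple per output, listing the product terms it ORs together.
    """
    # AND plane: a product term is true when every connected literal matches.
    terms = [
        all(lit is None or bit == lit for bit, lit in zip(inputs, term))
        for term in and_plane
    ]
    # OR plane: an output is true when any of its selected terms is true.
    return [int(any(terms[i] for i in selected)) for selected in or_plane]


# "Programming" the PLA as a half adder with inputs (a, b):
AND_PLANE = [(1, 0), (0, 1), (1, 1)]  # a AND NOT b, NOT a AND b, a AND b
OR_PLANE = [(0, 1), (2,)]             # sum uses terms 0 and 1; carry uses term 2

for a, b in product((0, 1), repeat=2):
    print(a, b, pla_eval((a, b), AND_PLANE, OR_PLANE))
```

Changing only the two plane tables changes the function the circuit computes, which mirrors how a PLD is configured without altering the chip's layers.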

1.5 DESIGN TECHNOLOGY:

Design technology involves the manner in which we convert our concept of desired system functionality into an implementation. We must not only design the implementation to optimise design metrics, but we must do so quickly.



Variations of a top-down design process have become popular in the past decade, an ideal form of which is illustrated in the figure. The designer refines the system through several abstraction levels. At the system level the designer describes the desired functionality in an executable language like C. This is called the system specification.

The designer refines this specification by distributing portions of it among several general and/or single purpose processors, yielding behavioural specifications for each processor.

The designer refines these specifications into register-transfer (RT) specifications by converting behaviour on general-purpose processors to assembly code, and by converting behaviour on single purpose processors to a connection of register-transfer components and state machines. The designer then refines the RT level specification into a logic specification.

Finally, the designer refines the remaining specifications into an implementation consisting of machine code for general purpose processors and a gate level netlist for single purpose processors.

Fig 1.9: Ideal top-down design process, and productivity improvers.

There are three main approaches to improving the design process for increased productivity, which we label as compilation/synthesis, libraries/IP, and test/verification. Several other approaches also exist.



1.5.1 Compilation/Synthesis

Compilation/Synthesis lets a designer specify desired functionality in an abstract manner, and automatically generates lower-level implementation details. Describing a system at high abstraction levels can improve productivity by reducing the amount of detail, often by an order of magnitude, that a designer must specify.

A logic synthesis tool converts Boolean expressions into a connection of logic gates (called a netlist). A register-transfer (RT) synthesis tool converts finite-state machines and register-transfers into a datapath of RT components and a controller of Boolean equations. A behavioral synthesis tool converts a sequential program into finite-state machines and register transfers. Likewise, a software compiler converts a sequential program to assembly code, which is essentially register-transfer code. Finally, a system synthesis tool converts an abstract system specification into a set of sequential programs on general and single-purpose processors.

The relatively recent maturation of RT and behavioral synthesis tools has enabled a unified view of the design process for single-purpose and general-purpose processors. Design for the former is commonly known as hardware design, and design for the latter as software design. In the past, the design processes were radically different: software designers wrote sequential programs, while hardware designers connected components.

Fig 1.10: The co-design ladder: recent maturation of synthesis enables a unified view of hardware and software.



1.5.2 Libraries/IP

Libraries involve re-use of pre-existing implementations. Using libraries of existing implementations can improve productivity if the time it takes to find, acquire, integrate and test a library item is less than that of designing the item oneself. A logic-level library may consist of layouts for gates and cells. An RT-level library may consist of layouts for RT components, like registers, multiplexors, decoders, and functional units. A behavioral-level library may consist of commonly used components, such as compression components, bus interfaces, display controllers, and even general purpose processors. The advent of system-level integration has caused a great change in this level of library.

1.5.3 Test/Verification

Test/Verification involves ensuring that functionality is correct. Such assurance can prevent time-consuming debugging at low abstraction levels and iterating back to high abstraction levels. Simulation is the most common method of testing for correct functionality, although more formal verification techniques are growing in popularity.

At the logic level, gate level simulators provide output signal timing waveforms given input signal waveforms. Likewise, general-purpose processor simulators execute machine code. At the RT-level, hardware description language (HDL) simulators execute RT-level descriptions and provide output waveforms given input waveforms. At the behavioral level, HDL simulators simulate sequential programs, and co-simulators connect HDL and general purpose processor simulators to enable hardware/software co-verification. At the system level, a model simulator simulates the initial system specification using an abstract computation model, independent of any processor technology, to verify correctness and completeness of the specification.

1.5.4 More productivity improvers

There are numerous additional approaches to improving designer productivity. Standards focus on developing well-defined methods for specification, synthesis and libraries. Such standards can reduce the problems that arise when a designer uses multiple tools, or retrieves or provides design information from or to other designers. Common standards include language standards, synthesis standards and library standards.

Languages focus on capturing desired functionality with minimum designer effort. For example, the sequential programming language of C is giving way to the object oriented language of C++, which in turn has given some ground to Java. As another example, state-machine languages permit direct capture of functionality as a set of states and transitions, which can then be translated to other languages like C.
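Capturing functionality as a set of states and transitions can be sketched as a plain transition table. The example below uses Python for brevity rather than a real state-machine language; the controller, its state and event names, and the `run` helper are all hypothetical and are not taken from the notes.

```python
# Transition table for a hypothetical controller: (state, event) -> next state.
TRANSITIONS = {
    ("IDLE", "start"): "RUNNING",
    ("RUNNING", "stop"): "IDLE",
    ("RUNNING", "fault"): "ERROR",
    ("ERROR", "reset"): "IDLE",
}


def run(events, state="IDLE"):
    """Feed events to the machine one by one and return the final state."""
    for event in events:
        # Events with no matching transition leave the state unchanged.
        state = TRANSITIONS.get((state, event), state)
    return state


print(run(["start", "fault"]))           # prints ERROR
print(run(["start", "fault", "reset"]))  # prints IDLE
```

A translator to C would typically turn the same table into a switch statement over the current state, or into a lookup array indexed by state and event.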

Frameworks provide a software environment for the application of numerous tools throughout the design process and management of versions of implementations. For example, a framework might generate the UNIX directories needed for various simulators and synthesis tools, supporting application of those tools through menu selections in a single graphical user interface.



RECOMMENDED QUESTIONS

UNIT 1

Overview of embedded systems

1. What is an embedded system? Why is it so hard to define an embedded system?

2. List and define the three main characteristics of embedded systems that distinguish such systems from other computing systems.

3. What is a design metric?

4. List a pair of design metrics that may compete with one another, providing an intuitive explanation of the reason behind it.

5. What is a market window and why is it so important to reach the market early in this window?

6. What is NRE cost?

7. List and define the three main processor technologies. What are the benefits of using the different processor technologies?

8. List the main IC technologies and list out their benefits.

9. List the three main design technologies. How are they helpful to designers?

10. Provide a definition of Moore's law.

11. Compute the annual growth rate of IC capacity and designer productivity.

12. What is the design gap?

13. What is a renaissance engineer and why is such an engineer so important in the current market?

14. Define what is meant by the mythical man-month.



QUESTION PAPER SOLUTION

UNIT 1

Q1. Highlight the merits and demerits of single purpose processors and general-purpose processors.

Single Purpose Processors:

Merits:

1. They are fast

2. They consume low power

3. They have small size

4. Unit cost may be low for large quantities

Demerits:

1. NRE costs may be high

2. Low flexibility

3. Unit cost high for small quantities

4. Performance may not match general-purpose processors for some applications

General Purpose Processors:

Merits:

1. High flexibility

2. Low NRE costs

3. Low time to market

4. Performance may be fast for computation-intensive applications.

Demerits:

1. Unit cost may be relatively high for large quantities.

2. Performance may be slower for certain applications.

3. Size and power may be large due to unnecessary processor hardware.

Q2. How is a single purpose processor distinctly different from a general-purpose processor?

Sl. No. | Single Purpose Processor | General Purpose Processor
1. | Executes exactly one program. | Executes any program written by the user.
2. | The functionality cannot be changed. | The functionality can be changed by the user by writing the required program.
3. | Does not have program memory. | Has program memory.
4. | Has no flexibility and contains only the resources required for its particular functionality. | Has a very large amount of resources, which may or may not be used for a particular functionality, as decided by the user.
5. | Merits: fast, low power consumption, small size, and unit cost may be low for large quantities. | Merits: high flexibility, low NRE costs, low time to market, and performance may be fast for computation-intensive applications.

Q3. Explain the three processor technologies briefly.

1. General Purpose Processors: Software

They are programmable devices used in a variety of applications. They are also known as microprocessors. They have a program memory and a general data path with a large register file and a general ALU. The data path must be large enough to handle a variety of computations. The programmer writes a program into the program memory to carry out the required functionality, using the features (instructions) provided by the general data path. This is called the software portion of the system. Such processors offer low time-to-market, low NRE costs and high flexibility.

2. Single Purpose Processors: Hardware

This is a digital circuit designed to execute exactly one program. It contains only the components needed to execute that single program, and it contains no program memory. The user cannot change the functionality of the chip. Such processors are fast, low-power and small.

3. Application Specific Processors: Application-specific instruction-set processors (ASIPs)



They are programmable processors optimized for a particular class of applications having common characteristics. They strike a compromise between general-purpose and single-purpose processors. They have a program memory, an optimized data path and special functional units. They offer good performance, size and power while retaining some flexibility.

Q4. What are the common design metrics that a design engineer should consider?

- NRE (non-recurring engineering) cost: the one-time monetary cost of designing the system.

- Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost.

- Size: the physical space required by the system, often measured in bytes for software and in number of gates for hardware.

- Performance: the execution/response time of the system.

- Power: the amount of power consumed by the system, which may determine the lifetime of the battery and the cooling requirement of the IC. More power means more heat.

- Flexibility: the ability to change the functionality of the system.

- Time to prototype: the time needed to build a working system without incurring heavy NRE.

- Time to market: the time required to develop the system and release it to the market.

- Maintainability: the ability to modify the system after its release to the market.

- Correctness: our confidence that we have implemented the system's functionality correctly.

- Safety: the probability that the system does not cause any harm.

Metrics typically compete with one another: improving one often leads to worsening of another.

Q5. Write short notes on IC technology

Every processor must eventually be implemented on an IC. IC technology involves the manner in which

we map a digital (gate-level) implementation onto an IC. An IC (Integrated Circuit), often called a chip,

is a semiconductor device consisting of a set of connected transistors and other devices. A number of

different processes exist to build semiconductors, the most popular of which is CMOS (Complementary

Metal Oxide Semiconductor). The IC technologies differ by how customized the IC is for a particular

implementation. IC technology is independent from processor technology; any type of processor can be

mapped to any type of IC technology.



The independence of processor and IC technologies: any processor technology can be

mapped to any IC technology.

To understand the differences among IC technologies, we must first recognize that semiconductors consist of numerous layers. The bottom layers form the transistors. The middle layers form logic gates. The top layers connect these gates with wires. One way to create these layers is by depositing photo-sensitive chemicals on the chip surface and then shining light through masks to change regions of the chemicals. Thus, the task of building the layers is actually one of designing appropriate masks. A set of masks is often called a layout. The narrowest line that we can create on a chip is called the feature size, which today is well below one micrometer (sub-micron). For each IC technology, all layers must eventually be built to get a working IC; the question is who builds each layer and when.

Q6. Derive the equation for the percentage revenue loss for any market rise. A product was delayed by 4 weeks in releasing to market. The peak revenue for on-time entry to market would occur after 20 weeks, for a market rise angle of 45 degrees. Find the percentage revenue loss.

Ans: Let us investigate the loss of revenue that can occur due to delayed entry of a product into the market. We can use a simple triangle model in which the y axis represents the market rise and the x axis represents time, including the point of entry to the market. The revenue for an on-time market entry is the area of the triangle labeled "on time", and the revenue for a delayed-entry product is the area of the triangle labeled "delayed". The revenue loss for a delayed entry is the difference of these triangles' areas.

$$\text{percentage revenue loss} = \frac{\text{on-time} - \text{delayed}}{\text{on-time}} \times 100\%$$

The area of a triangle is $\frac{1}{2} \times \text{base} \times \text{height}$, where

W: time at which the on-time revenue peaks; with a 45 degree market rise this is also the height of the on-time triangle

D: delay in entry (in weeks or months)

2W: product's lifetime

$$\text{Area of on-time triangle} = \frac{1}{2} \times 2W \times W = W^2$$

$$\text{Area of delayed triangle} = \frac{1}{2} \times (W - D + W) \times (W - D)$$

$$\text{percentage revenue loss} = \frac{D(3W - D)}{2W^2} \times 100\%$$

Ex: The product's lifetime is 52 weeks (W = 26)

Delay of entry to the market is 4 weeks (D = 4)

Percentage revenue loss = 4(78 - 4) / (2 x 26^2) x 100% = 22% (approximately)

For the product in the question (W = 20, D = 4), the same formula gives 4(60 - 4) / (2 x 20^2) x 100% = 28%.
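The triangle model above can be checked numerically. The following is a minimal sketch, assuming a 45 degree market rise so that the triangle height equals the elapsed time; the function name `revenue_loss_percent` is illustrative and not part of the notes.

```python
def revenue_loss_percent(w, d):
    """Percentage revenue loss for the triangle market model.

    w: time at which on-time revenue peaks (product lifetime is 2 * w)
    d: delay in market entry, in the same time unit as w
    """
    if w <= 0 or not 0 <= d <= w:
        raise ValueError("require w > 0 and 0 <= d <= w")
    on_time = 0.5 * (2 * w) * w            # base 2W, height W
    delayed = 0.5 * (2 * w - d) * (w - d)  # base (W - D + W), height (W - D)
    return (on_time - delayed) / on_time * 100


# Lifetime of 52 weeks (W = 26) with a 4 week delay, as in the worked example.
print(round(revenue_loss_percent(26, 4)))
# Peak after 20 weeks with a 4 week delay, as in the question.
print(round(revenue_loss_percent(20, 4)))
```

Computing the two areas directly, rather than using the closed form D(3W - D)/(2W^2), makes it easy to confirm that both routes agree.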

Q7. Compare GPP, SPP and ASSP along with their block diagrams.

1. General Purpose Processors: Software

They are programmable devices used in a variety of applications. They are also known as microprocessors. They have a program memory and a general data path with a large register file and a general ALU. The data path must be large enough to handle a variety of computations. The programmer writes a program into the program memory to carry out the required functionality, using the features (instructions) provided by the general data path. This is called the software portion of the system. Such processors offer low time-to-market, low NRE costs and high flexibility.

Design time and NRE cost are low, because the designer must only write a program, but need not do any digital design. Flexibility is high, because changing functionality requires only changing the program. Unit cost may be relatively low in small quantities, since the processor manufacturer sells large quantities to other customers and hence distributes the NRE cost over many units. Performance may be fast for computation-intensive applications, if using a fast processor, due to advanced architecture features and leading-edge IC technology.

There are also some design-metric drawbacks: unit cost may be too high for large quantities, performance may be slow for certain applications, and size and power may be large due to unnecessary processor hardware.

Figure 1.4(d) illustrates the use of a single-purpose processor in our embedded system example, representing an exact fit of the desired functionality, nothing more, nothing less.



Fig 1.4 Processors vary in their customization for the problem at hand: (a) desired functionality, (b) general-purpose processor, (c) application-specific processor, (d) single-purpose processor.

Fig 1.5 Implementing desired functionality on a general-purpose processor


2. Single Purpose Processors: Hardware

This is a digital circuit designed to execute exactly one program. It contains only the components needed to execute that single program, and it contains no program memory. The user cannot change the functionality of the chip. Such processors are fast, low-power and small.

An embedded system designer creates a single-purpose processor by designing a custom digital circuit. Using a single-purpose processor in an embedded system results in several design-metric benefits and drawbacks, which are essentially the inverse of those for general-purpose processors. Performance may be fast, size and power may be small, and unit cost may be low for large quantities, while design time and NRE costs may be high, flexibility is low, unit cost may be high for small quantities, and performance may not match general-purpose processors for some applications.



Fig 1.6 Implementing desired functionality on a single-purpose processor

3. Application Specific Processors: Application-specific instruction-set processors (ASIPs)

They are programmable processors optimized for a particular class of applications having common characteristics. They strike a compromise between general-purpose and single-purpose processors. They have a program memory, an optimized data path and special functional units. They offer good performance, size and power while retaining some flexibility.

An application-specific instruction-set processor (or ASIP) can serve as a compromise between the above

processor options. An ASIP is designed for a particular class of applications with common characteristics,

such as digital-signal processing, telecommunications, embedded control, etc. The designer of such a

processor can optimize the datapath for the application class, perhaps adding special functional units for

common operations, and eliminating other infrequently used units.



Fig 1.7 Implementing desired functionality on an application-specific processor

Digital-signal processors (DSPs) are a common class of ASIP, so deserve special mention. A DSP is a processor designed to perform common operations on digital signals, which are the digital encodings of analog signals like video and audio. These operations carry out common signal processing tasks like signal filtering, transformation, or combination. Such operations are usually math-intensive, including operations like multiply and add or shift and add. To support such operations, a DSP may have special-purpose datapath components such as a multiply-accumulate unit, which can perform a computation like T = T + M[i]*k using only one instruction. Because DSP programs often manipulate large arrays of data, a DSP may also include special hardware to fetch sequential data memory locations in parallel with other operations, to further speed execution.
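The multiply-accumulate computation T = T + M[i]*k can be sketched in software as a loop over an array. This is only an illustration of the arithmetic; the function name `multiply_accumulate` is an assumption, and on a DSP each loop iteration would correspond to a single multiply-accumulate instruction rather than separate multiply and add steps.

```python
def multiply_accumulate(m, k, t=0):
    """Accumulate t = t + m[i] * k over every element of the array m."""
    for sample in m:
        t = t + sample * k  # one multiply-accumulate step per array element
    return t


# Scale-and-sum three samples by a constant coefficient.
print(multiply_accumulate([1, 2, 3], 2))
```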

Q8. Suggest two methods to improve productivity.

There are numerous additional approaches to improving designer productivity. Standards focus on

developing well-defined methods for specification, synthesis and libraries. Such standards can reduce

the problems that arise when a designer uses multiple tools, or retrieves or provides design information

from or to other designers. Common standards include language standards, synthesis standards and

library standards.

Languages focus on capturing desired functionality with minimum designer effort. For example, the sequential programming language of C is giving way to the object-oriented language of C++, which in turn has given some ground to Java. As another example, state-machine languages permit direct capture of functionality as a set of states and transitions, which can then be translated to other languages like C.

Frameworks provide a software environment for the application of numerous tools throughout the

design process and management of versions of implementations. For example, a framework might

generate the UNIX directories needed for various simulators and synthesis tools, supporting application

of those tools through menu selections in a single graphical user interface.



UNIT 2

SINGLE-PURPOSE PROCESSORS: Hardware, Combinational Logic, Sequential Logic, RT

level Combinational and Sequential Components, Optimizing single-purpose processors. Single-

Purpose Processors: Software, Basic Architecture, Operation, Programmers View, Development

Environment, ASIPS.

6 Hours

TEXT BOOKS:

1. Embedded System Design: A Unified Hardware/Software Introduction - Frank

Vahid, Tony Givargis, John Wiley & Sons, Inc.2002

REFERENCE BOOKS:

1. Embedded Systems: Architecture and Programming, Raj Kamal, TMH. 2008

2. Embedded Systems Architecture A Comprehensive Guide for Engineers and

Programmers, Tammy Noergaard, Elsevier Publication, 2005

3. Embedded C programming, Barnett, Cox & Ocull, Thomson (2005).



UNIT 2

CUSTOM SINGLE PURPOSE PROCESSORS: HARDWARE

2.1 INTRODUCTION:

A processor is a digital circuit designed to perform computation tasks. A processor consists of a datapath capable of storing and manipulating data and a controller capable of moving data through the datapath.

A general-purpose processor is designed to carry out a wide variety of computation tasks. A single-purpose processor is designed specifically to carry out a particular computational task.

A custom single-purpose processor may be fast, small and low power, but it may also have high NRE cost, longer time-to-market and less flexibility.

2.2 COMBINATIONAL LOGIC:

1. Transistors and Logic Gates

2. Basic combinational logic design

3. RT level combinational components

Transistors and Logic Gates:

A transistor is the basic electrical component in digital systems. A transistor acts as a simple on/off switch. CMOS is one widely used transistor technology.

Fig 2.1 View of a CMOS transistor on silicon

The CMOS transistor consists of a gate, a source and a drain, where the gate controls the current flow from source to drain. A high voltage, typically +3 V or +5 V, is treated as logic 1, and a low voltage, typically ground, is treated as logic 0.



In an nMOS transistor, the transistor conducts, so current flows, when logic 1 is applied to the gate, and it does not conduct when logic 0 is applied to the gate. A pMOS transistor behaves in the opposite way.

Fig 2.2 a & b CMOS transistor implementation

Fig 2.2 a, b & c CMOS transistor implementation of inverter, NAND and NOR gates

Digital system designers work at the abstraction level of logic gates, where each gate is represented symbolically and described by a Boolean equation, as shown in figure 2.3.



Fig 2.3 Basic logic gates

Combinational logic design:

A combinational circuit is a digital circuit whose output is purely a function of its present inputs. Such a circuit has no memory of past inputs. An example is shown below.

Fig 2.4 Combinational logic design: problem description, truth table, output equations, minimized output equations, final circuit
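The design flow named in the figure caption (problem description, truth table, minimized equation) can be sketched for one small function. The function chosen here, "y is 1 if a is 1, or if b and c are both 1", is an assumed example, and the names `y_spec`, `y_minimized` and `truth_table` are illustrative only.

```python
from itertools import product


def y_spec(a, b, c):
    """Problem description: y is 1 if a is 1, or if b and c are both 1."""
    return 1 if a == 1 or (b == 1 and c == 1) else 0


def truth_table(func, n_inputs):
    """Enumerate every input combination and the resulting output."""
    return [(bits, func(*bits)) for bits in product((0, 1), repeat=n_inputs)]


def y_minimized(a, b, c):
    """Minimized equation y = a + bc, written with bitwise OR and AND."""
    return a | (b & c)


table = truth_table(y_spec, 3)
minterms = [bits for bits, out in table if out == 1]

# The minimized equation must agree with the truth table on every row.
assert all(y_minimized(*bits) == out for bits, out in table)
print(minterms)
```

Because a combinational circuit has no memory, checking all 2^n input rows is a complete test of the minimized equation.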



RT level combinational components:

Designing complex digital circuits from individual logic gates is time-consuming, so designers instead use register-transfer (RT) level combinational components such as multiplexers, decoders, adders, comparators and ALUs as building blocks.

Fig 2.5 combinational components

2.3 Sequential logic

a.Flip flops

b.RT level sequential components

c. Sequential logic design

2.3.1 Flip flops

A sequential circuit is a digital circuit whose outputs are a function of the present as well as previous input values. The basic sequential circuit is a flip-flop, which stores a single bit.



D flip-flop: It has two inputs, D and clock. When clock is 1, the value of D is stored in the flip-flop and appears at output Q. When clock is 0, the previously stored bit is maintained and continues to appear at Q.

SR flip-flop: It has three inputs, S, R and clock. When clock is 1, inputs S and R are examined: if S is 1, a 1 is stored; if R is 1, a 0 is stored; if both S and R are 0, there is no change; if both are 1, the behavior is undefined. Thus S stands for set and R for reset.
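The behavior described above can be captured in a small behavioral model. This is a sketch only: the class and method names are assumptions, and the undefined S = R = 1 case is modeled by raising an error, which is one possible choice rather than what real hardware does.

```python
class DFlipFlop:
    """Stores D while clock is 1; holds the stored bit while clock is 0."""

    def __init__(self):
        self.q = 0

    def update(self, d, clock):
        if clock == 1:
            self.q = d
        return self.q


class SRFlipFlop:
    """S sets the stored bit, R resets it, both 0 leaves it unchanged."""

    def __init__(self):
        self.q = 0

    def update(self, s, r, clock):
        if clock == 1:
            if s == 1 and r == 1:
                raise ValueError("S = R = 1 is undefined")
            if s == 1:
                self.q = 1
            elif r == 1:
                self.q = 0
        return self.q


d_ff = DFlipFlop()
print(d_ff.update(1, 1), d_ff.update(0, 0))  # store 1, then hold it
```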

Fig 2.6 Sequential components

2.3.2 RT level sequential components:

Registers, shift registers and counters are RT level sequential components. A register stores n bits from its n-bit data input I, with those stored bits appearing at its output Q; the bits are stored in parallel.

A shift register also stores n bits, but these bits cannot be stored in parallel; instead they are shifted into the register serially. A shift register has one data input I and two control inputs, clock and shift.

A counter is a register that can also increment (add binary 1 to) its stored binary value. A synchronous input's value only has an effect during a clock edge. An asynchronous input's value affects the circuit independently of the clock. All these components are shown in figure 2.6.
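A behavioral sketch of the shift register and counter described above follows. The class names, the `clock_edge` method, the shift direction (new bit enters at index 0) and the synchronous `clear` input are all assumptions made for illustration.

```python
class ShiftRegister:
    """n-bit register loaded serially: one bit enters per clock edge."""

    def __init__(self, n):
        self.bits = [0] * n

    def clock_edge(self, i, shift):
        if shift == 1:
            # New bit enters at index 0; the oldest bit falls off the end.
            self.bits = [i] + self.bits[:-1]
        return list(self.bits)


class Counter:
    """n-bit register that adds 1 on a clock edge when count is asserted."""

    def __init__(self, n):
        self.n = n
        self.value = 0

    def clock_edge(self, count, clear=0):
        if clear == 1:
            self.value = 0
        elif count == 1:
            self.value = (self.value + 1) % (1 << self.n)  # wrap at 2^n
        return self.value


sr = ShiftRegister(4)
for bit in [1, 0, 1, 1]:
    sr.clock_edge(bit, shift=1)
print(sr.bits)
```

Both models change state only inside `clock_edge`, which mirrors the idea that synchronous inputs take effect only on a clock edge.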



2.3.3 Sequential logic design

Sequential logic design can be achieved using a straightforward technique, which is illustrated below.

Fig 2.7 (a) (b)( c)( d) sequential logic design

Fig 2.7 (e) (f) sequential logic design



2.4 Custom single purpose processor design:

A basic processor consists of a controller and a datapath. The datapath stores and manipulates a system's data. The controller carries out the configuration of the datapath: it sets the datapath control inputs, such as register load and multiplexer select signals, of the register units, functional units and connection units to obtain the desired configuration of the datapath.

Fig 2.8 A basic processor: (a) controller and datapath, (b) view inside the controller and datapath

Example program:

First create the algorithm

Convert the algorithm to a complex state machine

Known as an FSMD: finite-state machine with datapath

Can use templates to perform such conversion



Fig : 2.9 Example program GCD

Create a register for any declared variable

Create a functional unit for each arithmetic operation

Connect the ports, registers and functional units

Based on reads and writes

Use multiplexors for multiple sources

Create unique identifier for each datapath component control input and output

Templates for creating state diagram:

We finished the datapath

We have a state table for the next state and control logic

All that's left is combinational logic design

This is not an optimized design, but we see the basic steps
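The idea of an FSMD, a state machine whose states perform register updates, can be sketched in software. The GCD program itself appears only in Fig 2.9, so the subtraction-based algorithm, the state names and the function `gcd_fsmd` below are assumptions for illustration; each loop iteration stands for one clock cycle.

```python
def gcd_fsmd(x_i, y_i):
    """Simulate a subtraction-based GCD as an FSMD.

    Returns (result, cycles), where cycles counts the states visited.
    """
    if x_i <= 0 or y_i <= 0:
        raise ValueError("inputs must be positive integers")
    x = y = d_o = 0  # datapath registers, one per declared variable
    state = "LOAD"
    cycles = 0
    while state != "DONE":
        cycles += 1
        if state == "LOAD":      # load input ports into registers
            x, y = x_i, y_i
            state = "TEST"
        elif state == "TEST":    # comparator outputs select the next state
            if x == y:
                state = "OUTPUT"
            elif x < y:
                state = "SUB_Y"
            else:
                state = "SUB_X"
        elif state == "SUB_Y":   # subtractor result written back to y
            y = y - x
            state = "TEST"
        elif state == "SUB_X":   # subtractor result written back to x
            x = x - y
            state = "TEST"
        elif state == "OUTPUT":  # drive the output port
            d_o = x
            state = "DONE"
    return d_o, cycles


print(gcd_fsmd(12, 8)[0])
```

The registers (`x`, `y`, `d_o`) and the operations (compare, subtract) correspond to the datapath, while the `state` variable and its transitions correspond to the controller.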



Fig 2.10 : Templates for creating state diagram

2.5 RT level Custom Single Purpose processor Design:

We often start with a state machine rather than an algorithm

Cycle timing is often too central to functionality

Example

Bus bridge that converts 4-bit bus to 8-bit bus

Start with FSMD

Known as register-transfer (RT) level

Exercise: complete the design
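One way to picture the bus bridge is as a two-state machine that collects two 4-bit values and emits one 8-bit value. The sketch below is not the design from Fig 2.13: the state names, the `rdy_in` handshake signal and the choice to receive the low nibble first are all assumptions.

```python
class BusBridge:
    """Collects two 4-bit transfers and emits one 8-bit value."""

    def __init__(self):
        self.state = "WAIT_LOW"
        self.data_lo = 0

    def clock_edge(self, rdy_in, data_in):
        """Return the 8-bit output on the cycle it is ready, else None."""
        if not rdy_in:
            return None                # sender idle: stay in current state
        nibble = data_in & 0xF         # only 4 bits arrive per transfer
        if self.state == "WAIT_LOW":
            self.data_lo = nibble      # register the low half
            self.state = "WAIT_HIGH"
            return None
        self.state = "WAIT_LOW"        # second half arrived: emit the byte
        return (nibble << 4) | self.data_lo


bridge = BusBridge()
bridge.clock_edge(1, 0xA)
print(hex(bridge.clock_edge(1, 0x5)))
```

Starting from states and per-cycle behavior, as here, reflects why such designs begin with an FSMD rather than an algorithm: what happens on each clock cycle is part of the specification.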



Fig 2.13 RT level Custom Single Purpose processor Design example

2.6 Optimizing Custom single-purpose processors

Optimization is the task of making design metric values the best

possible

Optimization opportunities

original program

FSMD datapath

FSM



Optimizing the original program

Analyze program attributes and look for areas of possible improvement

number of computations

size of variable

time and space complexity

operations used

multiplication and division very expensive

Fig 2.15 optimizing the program

Optimizing the FSMD:



Areas of possible improvements

merge states

states with constants on transitions can be eliminated, since the transition taken is already known

states with independent operations can be merged

separate states

states which require complex operations (a*b*c*d) can be broken into smaller states to reduce hardware size

scheduling

Fig 2.16 Optimizing the FSMD for GCD

Optimizing the datapath:



Sharing of functional units

one-to-one mapping, as done previously, is not necessary

if same operation occurs in different states, they can share a single

functional unit

Multi-functional units

ALUs support a variety of operations; an ALU can be shared among operations occurring in different states

Optimizing the FSM:

State encoding

task of assigning a unique bit pattern to each state in an FSM

size of state register and combinational logic vary

can be treated as an ordering problem

State minimization

task of merging equivalent states into a single state

two states are equivalent if, for all possible input combinations, they generate the same outputs and transition to the same next state



GENERAL PURPOSE PROCESSORS: SOFTWARE

A general-purpose processor is a processor designed for a variety of computation tasks

Low unit cost, in part because manufacturer spreads NRE over large numbers of units

Motorola sold half a billion 68HC05 microcontrollers in 1996 alone

Carefully designed since higher NRE is acceptable

Can yield good performance, size and power

Low NRE cost, short time-to-market/prototype, high flexibility

User just writes software; no processor design

a.k.a. "microprocessor": "micro" used when they were implemented on one or a few chips rather than entire rooms

Basic Architecture:

A general-purpose processor, sometimes called a CPU, consists of a datapath and a control unit, linked with memory.

Control unit and datapath

Note similarity to single-purpose processor

Key differences

Datapath is general

Control unit doesn't store the algorithm; the algorithm is programmed into the memory

Datapath Operations:

Load

Read memory location into register

ALU operation

Input certain registers through ALU, store back in register

Store

Write register to memory location



Fig 2.17 GPP basic architecture

Control unit :

Control unit: configures the datapath operations

Sequence of desired operations (instructions) stored in memory: the program

Instruction cycle broken into several sub-operations, each one clock cycle, e.g.:

Fetch: Get next instruction into IR

Decode: Determine what the instruction means

Fetch operands: Move data from memory to datapath register

Execute: Move data through the ALU

Store results: Write data from register to memory



Control Unit Sub-Operations:

Fetch

Get next instruction into IR

PC: program counter, always points to next instruction

IR: holds the fetched instruction

Decode

Determine what the instruction means

Fetch operands

Move data from memory to datapath register

Execute

Move data through the ALU

This particular instruction does nothing during this sub-operation

Store results

Write data from register to memory

This particular instruction does nothing during this sub-operation

Memory:

Program information consists of the sequence of instructions that cause the processor to carry out the desired system functionality. Data information represents the values being input, output and transformed by the program. We can store program and data together or separately.

In a Princeton architecture, data and program words share the same memory space. The Princeton architecture may result in a simpler hardware connection to memory, since only one connection is necessary.

In a Harvard architecture, the program memory space is distinct from the data memory space. A Harvard architecture, while requiring two connections, can perform instruction and data fetches simultaneously, so may result in improved performance.

Most machines have a Princeton architecture. The Intel 8051 is a well-known Harvard architecture.



Figure 2.19: Two memory architectures: (a) Harvard, (b) Princeton

Memory may be read-only memory (ROM) or readable and writable memory (RAM). ROM is usually much more compact than RAM. An embedded system often uses ROM for program memory, since, unlike in desktop systems, an embedded system's program does not change. Constant data may be stored in ROM, but other data of course requires RAM.

Memory may be on-chip or off-chip. On-chip memory resides on the same IC as the processor, while off-chip memory resides on a separate IC. The processor can usually access on-chip memory much faster than off-chip memory, perhaps in just one cycle, but finite IC capacity of course implies only a limited amount of on-chip memory.

Figure 2.20: Cache memory

To reduce the time needed to access (read or write) memory, a local copy of a portion of memory may be kept in a small but especially fast memory called cache. Cache memory often resides on-chip, and often uses fast but expensive static RAM technology rather than slower but cheaper dynamic RAM. Cache memory is based on the principle that if at a particular time a processor accesses a particular memory location, then the processor will likely access that location and immediate neighbors of the location in the near future.
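The locality principle above can be illustrated with a toy cache model that counts hits and misses. The direct-mapped organization, the block size and the class name `DirectMappedCache` are assumptions chosen for brevity; the notes do not specify a particular cache design at this point.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks only tags, not data."""

    def __init__(self, num_lines=4, block_size=4):
        self.num_lines = num_lines
        self.block_size = block_size
        self.tags = [None] * num_lines
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit; on a miss, load the containing block."""
        block = address // self.block_size  # neighbors share a block
        index = block % self.num_lines      # which cache line to use
        tag = block // self.num_lines       # identifies the block in that line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag
        self.misses += 1
        return False


cache = DirectMappedCache()
for address in range(16):  # sequential accesses to neighboring locations
    cache.access(address)
print(cache.hits, cache.misses)
```

With sequential addresses, the first access to each block misses and the following accesses to its neighbors hit, which is the behavior the locality principle predicts.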

Operation:

Instruction execution:

1. Fetch instruction: the task of reading the next instruction from memory into

the instruction register.

2. Decode instruction: the task of determining what operation the instruction

in the instruction register represents (e.g., add, move, etc.).

3. Fetch operands: the task of moving the instruction's operand data into appropriate registers.

4. Execute operation: the task of feeding the appropriate registers through the

ALU and back into an appropriate register.

5. Store results: the task of writing a register into memory.

If each stage takes one clock cycle, then we can see that a single instruction may take

several cycles to complete.
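The stages listed above can be sketched as a loop that fetches, decodes and executes instructions from a program memory. The instruction format (a tuple of opcode and two operands), the four opcodes and the function name `run` are hypothetical; the program and data are kept in separate lists purely for simplicity.

```python
def run(program, data, num_regs=4):
    """Execute a tiny hypothetical instruction set; return (regs, data)."""
    regs = [0] * num_regs
    pc = 0                        # program counter: next instruction
    while True:
        ir = program[pc]          # fetch: next instruction into IR
        pc += 1
        op, a, b = ir             # decode: split opcode and operands
        if op == "LOAD":          # fetch operand: memory -> register
            regs[a] = data[b]
        elif op == "ADD":         # execute: registers through the ALU
            regs[a] = regs[a] + regs[b]
        elif op == "STORE":       # store result: register -> memory
            data[b] = regs[a]
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return regs, data


program = [
    ("LOAD", 0, 0),   # R0 = data[0]
    ("LOAD", 1, 1),   # R1 = data[1]
    ("ADD", 0, 1),    # R0 = R0 + R1
    ("STORE", 0, 2),  # data[2] = R0
    ("HALT", 0, 0),
]
regs, data = run(program, [2, 3, 0])
print(data[2])
```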

Pipelining

Pipelining is a common way to increase the instruction throughput of a microprocessor.

We first make a simple analogy of two people approaching the chore of washing and

drying 8 dishes. In one approach, the first person washes all 8 dishes, and then the

second person dries all 8 dishes. Assuming 1 minute per dish per person, this approach

requires 16 minutes. The approach is clearly inefficient since at any time only one

person is working and the other is idle. Obviously, a better approach is for the second

person to begin drying the first dish immediately after it has been washed. This

approach requires only 9 minutes -- 1 minute for the first dish to be washed, and then 8

more minutes until the last dish is finally dry . We refer to this latter approach as

pipelined.
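The dish-washing arithmetic generalizes to any number of items and stages. The sketch below assumes every stage takes the same time and that the pipeline never stalls; the function names are illustrative.

```python
def non_pipelined_time(n_items, n_stages, stage_time=1):
    """Each stage processes every item before the next stage starts."""
    return n_items * n_stages * stage_time


def pipelined_time(n_items, n_stages, stage_time=1):
    """The first item takes n_stages steps; each later item adds one step."""
    if n_items == 0:
        return 0
    return (n_stages + n_items - 1) * stage_time


# 8 dishes, 2 stages (wash and dry), 1 minute per stage.
print(non_pipelined_time(8, 2), pipelined_time(8, 2))
```

For the 8 dishes and 2 stages of the analogy this reproduces the 16 and 9 minute figures; for a five-stage instruction pipeline the same formula gives 12 cycles for 8 instructions instead of 40.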



Figure 2.21: Pipelining: (a) non-pipelined dish cleaning, (b) pipelined dish cleaning,

(c) pipelined instruction execution.

Each dish is like an instruction, and the two tasks of washing and drying are like the five stages listed above. By using a separate unit (each akin to a person) for each stage, we can pipeline instruction execution. After the instruction fetch unit fetches the first instruction, the decode unit decodes it while the instruction fetch unit simultaneously fetches the next instruction.

Superscalar and VLIW Architectures:

Performance can be improved by:

Faster clock (but there's a limit)

Pipelining: slice up instruction into stages, overlap stages

Multiple ALUs to support more than one instruction stream

Superscalar

Scalar: non-vector operations

Fetches instructions in batches, executes as many as

possible

May require extensive hardware to detect

independent instructions

VLIW: each word in memory has multiple independent

instructions



Relies on the compiler to detect and schedule

instructions

Currently growing in popularity

Programmer's View

Programmer doesn't need detailed understanding of architecture

Instead, needs to know what instructions can be executed

Two levels of instructions:

Assembly level

Structured languages (C, C++, Java, etc.)

Most development today done using structured languages

But, some assembly level programming may still be necessary

Drivers: portion of program that communicates with and/or controls

(drives) another device

Often have detailed timing considerations, extensive bit

manipulation

Assembly level may be best for these

Fig 2.22 Instruction stored in memory



Instruction Set:

Defines the legal set of instructions for that processor

Data transfer: memory/register, register/register, I/O, etc.

Arithmetic/logical: move register through ALU and back

Branches: determine next PC value when not just PC+1

Addressing Modes:

Fig 2.23 Addressing modes



Fig 2.24 A Simple (Trivial) Instruction Set

Program and data memory space

The embedded system's programmer must be aware of the size of the available memory for program and for data. The programmer must not exceed these limits. In addition, the programmer will probably want to be aware of on-chip program and data memory capacity, taking care to fit the necessary program and data in on-chip memory if possible.

Registers

The assembly-language programmer must know how many registers are available for general-purpose data storage. The programmer must also be familiar with registers that have special functions. For example, a base register may exist, which permits the programmer to use a data-transfer instruction where the processor adds an operand field to the base register to obtain an actual memory address.

I/O

The programmer should be aware of the processor's input and output (I/O) facilities, with which the processor communicates with other devices. One common I/O facility is parallel I/O, in which the programmer can read or write a port (a collection of external pins) by reading or writing a special-function register. Another common I/O facility is a system bus, consisting of address and data ports that are automatically activated by certain addresses or types of instructions.

Interrupts

An interrupt causes the processor to suspend execution of the main program and instead jump to an Interrupt Service Routine (ISR) that fulfills a special, short-term processing need. In particular, the processor stores the current PC and sets it to the address of the ISR. After the ISR completes, the processor resumes execution of the main program by restoring the PC. The programmer should be aware of the types of interrupts supported by the processor (we describe several types in a subsequent chapter), and must write ISRs when necessary. The assembly-language programmer places each ISR at a specific address in program memory. The structured-language programmer must do so also; some compilers allow a programmer to force a procedure to start at a particular memory location, while others recognize pre-defined names for particular ISRs.

For example, we may need to record the occurrence of an event from a peripheral device, such as the pressing of a button. We record the event by setting a variable in memory when that event occurs, although the user's main program may not process that event until later. Rather than requiring the user to insert checks for the event throughout the main program, the programmer need merely write an interrupt service routine and associate it with an input pin connected to the button. The processor will then call the routine automatically when the button is pressed.

Operating System

Optional software layer providing low-level services to a program (application).

File management, disk access

Keyboard/display interfacing

Scheduling multiple programs for execution

Or even just multiple threads from one program

Program makes system calls to the OS

Development Environment

Development processor

The processor on which we write and debug our programs

Usually a PC

Target processor



The processor that the program will run on in our embedded system

Often different from the development processor

Software Development Process

Compilers

Cross compiler

Runs on one processor, but generates code for another

Assemblers

Linkers

Debuggers

Profilers

Fig 2.25 Software Development Process



Running a Program: If the development processor differs from the target, how can we run our compiled

code? Two options:

Download to target processor

Simulate

Simulation

One method: Hardware description language

But slow, not always available

Another method: Instruction set simulator (ISS)

Runs on development processor, but executes instructions of target

processor
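To make the idea of an ISS concrete, here is a minimal sketch of one for a made-up target. The three-register machine and its mnemonics (MOVI, ADD, SUBI, JNZ) are assumptions invented for illustration, not a real instruction set; the point is only that a program on the development processor fetches, decodes and executes the target's instructions one at a time.

```python
# Toy instruction-set simulator for a made-up target processor.
# The three-register machine and its mnemonics are illustrative assumptions.


def run(program, max_steps=1000):
    """Execute `program` (a list of tuples) and return (regs, pc, steps)."""
    regs = {"R0": 0, "R1": 0, "R2": 0}
    pc = 0
    steps = 0
    while pc < len(program) and steps < max_steps:
        op, *args = program[pc]      # fetch and decode
        pc += 1                      # default: fall through to the next instruction
        if op == "MOVI":             # MOVI Rd, constant
            regs[args[0]] = args[1]
        elif op == "ADD":            # ADD Rd, Rs   (Rd = Rd + Rs)
            regs[args[0]] += regs[args[1]]
        elif op == "SUBI":           # SUBI Rd, constant
            regs[args[0]] -= args[1]
        elif op == "JNZ":            # JNZ Rs, target   (jump if Rs != 0)
            if regs[args[0]] != 0:
                pc = args[1]
        else:
            raise ValueError(f"unknown opcode {op!r}")
        steps += 1
    return regs, pc, steps


# Sum 4 + 3 + 2 + 1 into R0 while counting R1 down to 0.
program = [
    ("MOVI", "R0", 0),
    ("MOVI", "R1", 4),
    ("ADD", "R0", "R1"),   # address 2: start of the loop body
    ("SUBI", "R1", 1),
    ("JNZ", "R1", 2),
]
```

Because the simulator owns regs and pc, features such as breakpoints or single-stepping amount to pausing this loop and inspecting those values.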

Testing and Debugging: ISS

Gives us control over time: set breakpoints, look at register values, set values, step-by-step execution, ...

But doesn't interact with the real environment

Download to board

Use device programmer

Runs in real environment, but not controllable

Compromise: emulator

Runs in real environment, at speed or near

Supports some controllability from the PC

Fig 2.26 Software Design Process



Application-Specific Instruction-Set Processors (ASIPs):

General-purpose processors

Sometimes too general to be effective in demanding application

e.g., video processing requires huge video buffers and operations on large arrays of data, inefficient on a GPP

But single-purpose processor has high NRE, not programmable

ASIPs targeted to a particular domain

Contain architectural features specific to that domain

e.g., embedded control, digital signal processing, video

processing, network processing, telecommunications, etc.

Still programmable

A Common ASIP: Microcontroller

For embedded control applications

Reading sensors, setting actuators

Mostly dealing with events (bits): data is present, but not in huge

amounts

e.g., VCR, disk drive, digital camera (assuming SPP for image

compression), washing machine, microwave oven

Microcontroller features

On-chip peripherals: timers, analog-digital converters, serial communication, etc.

Tightly integrated for programmer, typically part of register

space

On-chip program and data memory

Direct programmer access to many of the chip's pins

Specialized instructions for bit-manipulation and other low-level

operations

Digital Signal Processors (DSP)

For signal processing applications

Large amounts of digitized data, often streaming

Data transformations must be applied fast

e.g., cell-phone voice filter, digital TV, music synthesizer



DSP features

Several instruction execution units

Multiply-accumulate single-cycle instruction, other instructions

Efficient vector operations, e.g., add two arrays

Vector ALUs, loop buffers, etc.
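The multiply-accumulate and vector operations above can be sketched in software. The functions below are an illustrative model only (fir_output and vector_add are names chosen here): each loop iteration is the work a DSP's single-cycle multiply-accumulate instruction would do for one filter tap.

```python
def fir_output(coeffs, samples):
    """One FIR filter output: the sum of coeffs[i] * samples[i]."""
    if len(coeffs) != len(samples):
        raise ValueError("coeffs and samples must be the same length")
    acc = 0
    for c, x in zip(coeffs, samples):
        acc += c * x   # one multiply-accumulate (MAC) per tap
    return acc


def vector_add(a, b):
    """Element-wise add of two equal-length arrays."""
    if len(a) != len(b):
        raise ValueError("arrays must be the same length")
    return [x + y for x, y in zip(a, b)]
```

For example, fir_output([1, 2, 3], [4, 5, 6]) performs three multiply-accumulates and returns 32.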

Selecting a Microprocessor

Issues

Technical: speed, power, size, cost

Other: development environment, prior expertise, licensing, etc.

Speed: how do we evaluate a processor's speed?

Clock speed, but instructions per cycle may differ

Instructions per second, but work per instruction may differ

Dhrystone: Synthetic benchmark, developed in 1984.

Dhrystones/sec.

MIPS: 1 MIPS = 1757 Dhrystones per second (based on Digital's

VAX 11/780). A.k.a. Dhrystone MIPS. Commonly used today.

So, 750 MIPS = 750*1757 = 1,317,750 Dhrystones per

second

SPEC: set of more realistic benchmarks, but oriented to desktops

EEMBC: EDN Embedded Microprocessor Benchmark Consortium

Suites of benchmarks: automotive, consumer electronics,

networking, office automation, telecommunications
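The Dhrystone MIPS conversion quoted above is a single multiplication. A small sketch, using only the 1757 Dhrystones-per-second baseline given in the text (the function names are chosen here for illustration):

```python
DHRYSTONES_PER_SEC_PER_MIPS = 1757  # VAX 11/780 baseline quoted in the text


def dmips_to_dhrystones_per_sec(dmips):
    """Convert a Dhrystone MIPS rating to Dhrystones per second."""
    return dmips * DHRYSTONES_PER_SEC_PER_MIPS


def dhrystones_per_sec_to_dmips(dhrystones_per_sec):
    """Convert a measured Dhrystones-per-second score to Dhrystone MIPS."""
    return dhrystones_per_sec / DHRYSTONES_PER_SEC_PER_MIPS
```

dmips_to_dhrystones_per_sec(750) reproduces the 1,317,750 Dhrystones per second worked out above.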

Designing a General Purpose Processor

Not something an embedded system designer normally would do

But instructive to see how simply we can build one top down

Remember that real processors aren't usually built this way

Much more optimized, much more bottom-up design



Fig 2.27 A simple microprocessor



RECOMMENDED QUESTIONS

UNIT 2

(Hardware)

1. What is a single purpose processor? What are the benefits of choosing a

single purpose processor over a general purpose processor?

2. How do nMOS and pMOS transistors differ?

3. Build a 3-input NAND gate using a minimum number of CMOS transistors.

4. Build a 3-input NOR gate using a minimum number of CMOS transistors.

5. Build a 2-input AND gate using a minimum number of CMOS transistors.

6. Build a 2-input OR gate using a minimum number of CMOS transistors.

7. Explain why NAND and NOR gates are more common than AND and OR

gates.

8. Distinguish between combinational and sequential circuit.

9. Design a 2-bit comparator with a single output, less-than, using

combinational design technique.

10. Design a 3 x 8 decoder with truth table and K-maps.

11. What is the difference between synchronous and asynchronous circuits?

12. What is the purpose of datapath and control path?

13. Design a single purpose processor that outputs Fibonacci numbers up to n places.

Start with a function computing the desired result, translate it into a

state diagram and sketch a probable datapath.

UNIT 2

(Software)

1. Describe why a general purpose processor could cost less than a single

purpose processor.

2. Create a table listing the address spaces for 8, 16, 24, 32, and 64 bit address

sizes.

3. Illustrate how program and data memory fetches can be overlapped in a

Harvard architecture.

4. For a microcontroller, create a table listing five existing variations, stressing

the features that differ from the basic version.



QUESTION PAPER SOLUTION

UNIT 2

Q1. Write an algorithm for GCD with more time complexity, write the FSMD, and determine the total number of steps required for GCD.

First create algorithm

Convert algorithm to complex state machine

Known as FSMD: finite-state machine with datapath

Can use templates to perform such conversion
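As a starting point for the first step, the algorithm can be written as a short function. This sketch assumes the familiar repeated-subtraction form of GCD (the slower, higher time-complexity version) and counts loop iterations; the count is of loop passes only, so the number of FSMD states visited per pass depends on how the state machine is drawn.

```python
def gcd_subtract(x, y):
    """GCD by repeated subtraction; returns (gcd, loop iterations)."""
    if x <= 0 or y <= 0:
        raise ValueError("inputs must be positive integers")
    iterations = 0
    while x != y:
        if x < y:
            y = y - x
        else:
            x = x - y
        iterations += 1
    return x, iterations
```

For inputs 42 and 8 the loop runs 8 times (x goes 42, 34, 26, 18, 10, 2, then y goes 8, 6, 4, 2) and the result is 2. Each variable here (x, y) would become a register and each operation (subtract, compare) a functional unit in the datapath.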

GCD

Create a register for any declared variable

Create a functional unit for each arithmetic operation

Connect the ports, registers and functional units

Based on reads and writes

Use multiplexors for multiple sources

Create unique identifier



for each datapath component control input and output

Templates for creating state diagram:

We finished the datapath

We have a state table for the next state and control logic

All that's left is combinational logic design

This is not an optimized design, but we see the basic steps

Templates for creating state diagram

Q2. Explain the different methods to optimize the FSMD.

Optimization is the task of making design metric values the best

possible

Optimization opportunities

original program



FSMD

datapath

FSM

Optimizing the original program

Analyze program attributes and look for areas of possible

improvement

number of computations

size of variable

time and space complexity

operations used

multiplication and division very expensive

Q3. Explain the different memory architectures.

Program information consists of the sequence of instructions that cause the processor

to carry out the desired system functionality. Data information represents the values

being input, output and transformed by the program. We can store program and data

together or separately.

In a Princeton architecture, data and program words share the same memory space. The

Princeton architecture may result in a simpler hardware connection to memory, since

only one connection is necessary.

In a Harvard architecture, the program memory space is distinct from the data memory

space. A Harvard architecture, while requiring two connections, can perform instruction

and data fetches simultaneously, so may result in improved performance.

Most machines have a Princeton architecture. The Intel 8051 is a well-known Harvard

architecture.
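The performance difference can be illustrated with a deliberately idealized cycle count. The model below is an assumption made for this sketch, not a description of any real processor: every memory access takes one cycle, a Princeton machine can make one access per cycle over its single connection, and a Harvard machine overlaps data accesses with instruction fetches perfectly.

```python
def princeton_cycles(instr_fetches, data_accesses):
    """One shared connection: every access needs its own cycle."""
    return instr_fetches + data_accesses


def harvard_cycles(instr_fetches, data_accesses):
    """Two connections: best case, each data access overlaps a fetch."""
    return max(instr_fetches, data_accesses)
```

Under these assumptions, a program with 100 instruction fetches and 40 data accesses needs 140 memory cycles on the Princeton model and 100 on the Harvard model. Real gains are smaller, since overlap is rarely perfect.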



Figure 2.19: Two memory architectures: (a) Harvard, (b) Princeton

Memory may be read-only memory (ROM) or readable and writable memory

(RAM). ROM is usually much more compact than RAM. An embedded system often uses

ROM for program memory, since, unlike in desktop systems, an embedded system's

program does not change. Constant data may be stored in ROM, but other data of

course requires RAM.

Memory may be on-chip or off-chip. On-chip memory resides on the same IC as the

processor, while off-chip memory resides on a separate IC. The processor can usually

access on-chip memory much faster than off-chip memory, perhaps in just one cycle, but

finite IC capacity of course implies only a limited amount of on-chip memory.

Q4. Explain pipelining for instruction execution with dish cleaning.

Pipelining is a common way to increase the instruction throughput of a microprocessor.

We first make a simple analogy of two people approaching the chore of washing and

drying 8 dishes. In one approach, the first person washes all 8 dishes, and then the

second person dries all 8 dishes. Assuming 1 minute per dish per person, this approach

requires 16 minutes. The approach is clearly inefficient since at any time only one

person is working and the other is idle. Obviously, a better approach is for the second

person to begin drying the first dish immediately after it has been washed. This

approach requires only 9 minutes -- 1 minute for the first dish to be washed, and then 8

more minutes until the last dish is finally dry.
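The 16-minute and 9-minute figures follow from a simple count, sketched below. The function names are chosen for illustration, and the model assumes every stage takes the same time and nothing ever stalls.

```python
def sequential_time(items, stages, time_per_stage=1):
    """Each stage finishes all items before the next stage starts."""
    return items * stages * time_per_stage


def pipelined_time(items, stages, time_per_stage=1):
    """Stages overlap: fill the pipeline once, then finish one item per step."""
    if items == 0:
        return 0
    return (stages + items - 1) * time_per_stage
```

With 8 dishes and 2 stages (wash, dry) these give 16 and 9 minutes. The same formulas applied to 8 instructions in a five-stage pipeline give 40 versus 12 cycles.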



Figure: Pipelining: (a) non-pipelined dish cleaning, (b) pipelined dish cleaning,

(c) pipelined instruction execution.

Each dish is like an instruction, and the two tasks of washing and drying are like the five

stages listed above. By using a separate unit (each akin to a person) for each stage, we can

pipeline instruction execution. After the instruction fetch unit fetches the first

instruction, the decode unit decodes it while the instruction fetch unit simultaneously

fetches the next instruction.

Q5. Explain the software development process.

Software Development Process

Compilers

Cross compiler

Runs on one processor, but generates code for another

Assemblers

Linkers

Debuggers

Profilers



Fig 2.25 Software Development Process

Running a Program: If the development processor differs from the target, how can we run our compiled

code? Two options:

Download to target processor

Simulate

Simulation

One method: Hardware description language

But slow, not always available

Another method: Instruction set simulator (ISS)

Runs on development processor, but executes instructions of target

processor

Testing and Debugging: ISS

Gives us control over time: set breakpoints, look at register values, set

values, step-by-step execution, ...

But doesn't interact with the real environment

Download to board

Use device programmer



Runs in real environment, but not controllable

Compromise: emulator

Runs in real environment, at speed or near

Supports some controllability from the PC

software design process



optimizing the program

Optimizing the FSMD:

Areas of possible improvements

merge states

states with constants on transitions can be eliminated, transition taken is already known

states with independent operations can be merged

separate states

states which require complex operations (a*b*c*d) can be

broken into smaller states to reduce hardware size

scheduling



optimizing the FSMD for GCD

Optimizing the datapath:

Sharing of functional units

one-to-one mapping, as done previously, is not necessary

if same operation occurs in different states, they can share a single

functional unit

Multi-functional units

ALUs support a variety of operations, so one ALU can be shared among operations occurring in different states

Optimizing the FSM:

State encoding

