
The Pentium Processor

2003

To be used with S. Dandamudi, “Fundamentals of Computer Organization and Design,” Springer, 2003.

S. Dandamudi Chapter 7:

Pentium Family

• Intel introduced microprocessors in 1971
  4-bit microprocessor 4004
  8-bit microprocessors
    » 8080
    » 8085
  16-bit processors
    » 8086 introduced in 1978
      – 20-bit address bus, 16-bit data bus
    » 8088 is a less expensive version
      – Uses 8-bit data bus
    » Can address up to 4 segments of 64 KB
    » Referred to as the real mode



Pentium Family (cont’d)

80186
  » A faster version of 8086
  » 16-bit data bus and 20-bit address bus
  » Improved instruction set
80286 was introduced in 1982
  » 24-bit address bus
  » 16 MB address space
  » Enhanced with memory protection capabilities
  » Introduced protected mode
    – Segmentation in protected mode is different from the real mode
  » Backwards compatible




80386 was introduced in 1985
  » First 32-bit processor
  » 32-bit data bus and 32-bit address bus
  » 4 GB address space
  » Segmentation can be turned off (flat model)
  » Introduced paging
80486 was introduced in 1989
  » Improved version of 386
  » Combined coprocessor functions for performing floating-point arithmetic
  » Added parallel execution capability to instruction decode and execution units
    – Achieves scalar execution of 1 instruction/clock
  » Later versions introduced energy savings for laptops




Pentium (80586) was introduced in 1993
  » Similar to 486 but with 64-bit data bus
  » Wider internal datapaths
    – 128- and 256-bit wide
  » Added second execution pipeline
    – Superscalar performance
    – Two instructions/clock
  » Doubled on-chip L1 cache
    – 8 KB data
    – 8 KB instruction
  » Added branch prediction




Pentium Pro was introduced in 1995
  » Three-way superscalar
    – 3 instructions/clock
  » 36-bit address bus
    – 64 GB address space
  » Introduced dynamic execution
    – Out-of-order execution
    – Speculative execution
  » In addition to the L1 cache
    – Has 256 KB L2 cache




Pentium II was introduced in 1997
  » Introduced multimedia (MMX) instructions
  » Doubled on-chip L1 cache
    – 16 KB data
    – 16 KB instruction
  » Introduced comprehensive power management features
    – Sleep
    – Deep sleep
  » In addition to the L1 cache
    – Has 256 KB L2 cache
Pentium III, Pentium IV,…




Itanium processor
  » RISC design
    – Previous designs were CISC
  » 64-bit processor
  » Uses 64-bit address bus
  » 128-bit data bus
  » Introduced several advanced features
    – Speculative execution
    – Predication to eliminate branches
    – Branch prediction



Pentium Processor




Pentium Processor (cont’d)

• Data bus (D0 – D63)
  64-bit data bus
• Address bus (A3 – A31)
  Only 29 lines
    » No A0 – A2 (due to 8-byte wide data bus)
• Byte enable (BE0# – BE7#)
  Identifies the set of bytes to read or write
    » BE0# : least significant byte (D0 – D7)
    » BE1# : next byte (D8 – D15)
    » …
    » BE7# : most significant byte (D56 – D63)
  Any combination of bytes can be specified
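The split between the A3 – A31 lines and the byte-enable lines can be illustrated with a small Python model. This is only a sketch: the function name `bus_signals` and the restriction to accesses that stay inside one aligned 8-byte quadword are assumptions made for illustration, and bus timing is not modeled.

```python
def bus_signals(address, size):
    """Return (value on A31..A3, BE# pattern) for a single-quadword access.

    The BE# pattern is active low: bit i is 0 when byte lane i is used.
    """
    lane = address & 0x7                      # byte position inside the quadword
    if size < 1 or lane + size > 8:
        raise ValueError("access must fit within one aligned 8-byte quadword")
    quadword = address >> 3                   # A0..A2 are not driven on the bus
    enabled = ((1 << size) - 1) << lane       # bit i set => byte lane i is used
    be_n = ~enabled & 0xFF                    # invert because BE# is active low
    return quadword, be_n


# A 2-byte access at address 1002H uses byte lanes 2 and 3 (D16 - D31).
quadword, be_n = bus_signals(0x1002, 2)
print(hex(quadword), format(be_n, "08b"))
```

Only the two enabled lanes are 0 in the printed pattern, which is how any combination of bytes within the quadword can be selected.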




• Data parity (DP0 – DP7)
  Even parity for 8 bytes of data
    » DP0 : D0 – D7
    » DP1 : D8 – D15
    » …
    » DP7 : D56 – D63
• Parity check (PCHK#)
  Indicates the parity check result on data read
  Parity is checked only for valid bytes
    » Indicated by BE# signals
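As a rough illustration of even parity per byte lane, the following Python sketch computes the eight parity bits for a 64-bit value. The helper names are hypothetical; the only assumption is the usual definition of even parity (data bits plus parity bit contain an even number of 1s).

```python
def even_parity_bit(byte):
    """Parity bit that makes the total number of 1 bits (data + parity) even."""
    return bin(byte & 0xFF).count("1") & 1


def data_parity(qword):
    """Return [DP0, ..., DP7] for a 64-bit value, one bit per byte lane."""
    return [even_parity_bit((qword >> (8 * lane)) & 0xFF) for lane in range(8)]


print(data_parity(0x0000000000000701))
```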




• Parity enable (PEN#)
  Determines whether parity check should be used
• Address parity (AP)
  Bad address parity during inquire cycles
• Memory/IO (M/IO#)
  Defines bus cycle: memory or I/O
• Write/Read (W/R#)
  Distinguishes between write and read cycles
• Data/Code (D/C#)
  Distinguishes between data and code




• Cacheability (CACHE#)
  Read cycle: indicates internal cacheability
  Write cycle: burst write-back
• Bus lock (LOCK#)
  Used in read-modify-write cycle
  Useful in implementing semaphores
• Interrupt (INTR)
  External interrupt signal
• Nonmaskable interrupt (NMI)
  External NMI signal




• Clock (CLK)
  System clock signal
• Bus ready (BRDY#)
  Used to extend the bus cycle
    » Introduces wait states
• Bus request (BREQ)
  Used in bus arbitration
• Backoff (BOFF#)
  Aborts all pending bus cycles and floats the bus
  Useful to resolve deadlock between two bus masters




• Bus hold (HOLD)
  Completes outstanding bus cycles and floats bus
  Asserts HLDA to give control of bus to another master
• Bus hold acknowledge (HLDA)
  Indicates the Pentium has given control to another local master
  Pentium continues execution from its internal caches
• Cache enable (KEN#)
  If asserted, the current cycle is transformed into cache line fill




• Write-back/Write-through (WB/WT#)
  Determines the cache write policy to be used
• Reset (RESET)
  Resets the processor
  Starts execution at FFFFFFF0H
  Invalidates all internal caches
• Initialization (INIT)
  Similar to RESET but internal caches and FP registers are not flushed
  After powerup, use RESET (not INIT)



Pentium Registers

• Four 32-bit registers can be used as
  Four 32-bit registers (EAX, EBX, ECX, EDX)
  Four 16-bit registers (AX, BX, CX, DX)
  Eight 8-bit registers (AH, AL, BH, BL, CH, CL, DH, DL)
• Some registers have special use
  ECX for count in loop instructions



Pentium Registers (cont’d)

• Two index registers
  16- or 32-bit registers
  Used in string instructions
    » Source (SI) and destination (DI)
  Can be used as general-purpose data registers
• Two pointer registers
  16- or 32-bit registers
  Used exclusively to maintain the stack








• Control registers
  (E)IP
    » Program counter
  (E)FLAGS
    » Status flags
      – Record status information about the result of the last arithmetic/logical instruction
    » Direction flag
      – Forward/backward direction for data copy
    » System flags
      – IF : interrupt enable
      – TF : trap flag (useful in single-stepping)




• Segment registers
  Six 16-bit registers
  Support segmented memory architecture
  At any time, only six segments are accessible
  Segments contain distinct contents
    » Code
    » Data
    » Stack



Real Mode Architecture

• Pentium supports two modes
  Real mode
    » Uses 16-bit addresses
    » Runs 8086 programs
    » Pentium acts as a faster 8086
  Protected mode
    » 32-bit mode
    » Native mode of Pentium
    » Supports segmentation and paging



Real Mode Architecture (cont’d)

• Segmented organization
  16-bit wide segments
  Two components
    » Base (16 bits)
    » Offset (16 bits)
• Two-component specification is called logical address
  The offset component is also called the effective address
• 20-bit physical address




• Conversion from logical to physical addresses
  Example: segment base 1100H, offset 450H

        11000   (append 0 to the base, i.e., shift it left by 4 bits)
      +   450   (offset)
        -----
        11450   (physical address)




Two logical addresses map to the same physical address
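The real-mode conversion can be sketched in a few lines of Python. The function name `physical_address` is hypothetical, and the 20-bit wrap-around mask is an assumption that models the 8086's 20-bit address bus. The second call uses a different segment:offset pair chosen to land on the same physical address.

```python
def physical_address(segment, offset):
    """Real-mode translation: shift the 16-bit segment left by 4 bits, add the offset."""
    return ((segment << 4) + offset) & 0xFFFFF   # keep 20 bits (8086 address bus)


print(hex(physical_address(0x1100, 0x0450)))   # base 1100H, offset 450H
print(hex(physical_address(0x1000, 0x1450)))   # a different logical address
```

Both calls print the same physical address, which is the sense in which two logical addresses can map to one memory location.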





• Programs can access up to six segments at any time

• Two of these are for
  Data
  Code
• Another segment is typically used for
  Stack
• Other segments can be used for data, code,..







Protected Mode Architecture

• Supports sophisticated segmentation

• Segment unit translates 32-bit logical address to 32-bit linear address

• Paging unit translates 32-bit linear address to 32-bit physical address
  If no paging is used
    » Linear address = physical address



Protected Mode Architecture (cont’d)

Address translation





• Index
  Selects a descriptor from one of two descriptor tables
    » Local
    » Global
• Table Indicator (TI)
  Selects the descriptor table to be used
    » 0 = Global descriptor table
    » 1 = Local descriptor table
• Requestor Privilege Level (RPL)
  Privilege level to provide protected access to data
    » Smaller the RPL, higher the privilege level
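A 16-bit segment selector packs these three fields together. The sketch below assumes the standard IA-32 layout (bits 15 to 3 hold the index, bit 2 is TI, bits 1 to 0 are the RPL); the function name `decode_selector` is hypothetical.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into its index, TI, and RPL fields."""
    return {
        "index": (selector >> 3) & 0x1FFF,   # 13-bit descriptor table index
        "ti": (selector >> 2) & 0x1,         # 0 = GDT, 1 = LDT
        "rpl": selector & 0x3,               # 0 is the most privileged level
    }


print(decode_selector(0x002B))
print(decode_selector(0x000F))
```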




Visible part
  » Instructions to load segment selector
        mov, pop, lds, les, lss, lgs, lfs
Invisible part
  » Automatically loaded from a descriptor table when the visible part is loaded




Segment descriptor





• Base address
  32-bit segment starting address
• Granularity (G)
  Indicates whether the segment size is in
    » 0 = bytes, or
    » 1 = 4 KB
• Segment Limit
  20-bit value specifies the segment size
    » G = 0: 1 byte to 1 MB
    » G = 1: 4 KB to 4 GB, in increments of 4 KB
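The interaction between the 20-bit limit and the G bit can be checked with a short Python sketch. It assumes the usual convention that a limit value of n describes n + 1 units; the function name `segment_size` is hypothetical.

```python
def segment_size(limit, granularity):
    """Segment size in bytes for a 20-bit limit field and the G bit."""
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("limit is a 20-bit field")
    units = limit + 1                        # limit 0 still covers one unit
    return units * 4096 if granularity else units


print(segment_size(0xFFFFF, 0))   # largest byte-granular segment
print(segment_size(0xFFFFF, 1))   # largest 4 KB-granular segment
```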




• D/B bit
  Code segment
    » D bit: default size of operands and offset value
      – D = 0: 16-bit values
      – D = 1: 32-bit values
  Data segment
    » B bit: controls the size of the stack and stack pointer
      – B = 0: SP is used with an upper bound of FFFFH
      – B = 1: ESP is used with an upper bound of FFFFFFFFH
  Cleared for real mode
  Set for protected mode




• S bit
  Identifies whether
    » System segment, or
    » Application segment
• Descriptor privilege level (DPL)
  Defines segment privilege level
• Type
  Identifies type of segment
    » Data segment: read-only, read-write, …
    » Code segment: execute-only, execute/read-only, …
• P bit
  Indicates whether the segment is present




• Three types of segment descriptor tables
  Global descriptor table (GDT)
    » Only one in the system
    » Contains OS code and data
    » Available to all tasks
  Local descriptor table (LDT)
    » Several LDTs
    » Contains descriptors of a program
  Interrupt descriptor table (IDT)
    » Used in interrupt processing
    » Details in Chapter 20




• Segmentation Models
  Pentium can turn off segmentation
  Flat model
    » Consists of one segment of 4 GB
    » E.g. used by UNIX
  Multisegment model
    » Up to six active segments
    » Can have more than six segments
      – Descriptors must be in the descriptor table
    » A segment becomes active by loading its descriptor into one of the segment registers







Mixed-Mode Operation

• Pentium allows mixed-mode operation
  Possible to combine 16-bit and 32-bit operands and addresses
  D/B bit indicates the default size
    » 0 = 16-bit mode
    » 1 = 32-bit mode
  Pentium provides two override prefixes
    » One for operands
    » One for addresses
  Details and examples in Chapter 11



Default Segments

• Pentium uses default segments depending on the purpose of the memory reference
  Instruction fetch
    » CS register
  Stack operations
    » 16-bit mode: SP
    » 32-bit mode: ESP
  Accessing data
    » DS register
    » Offset depends on the addressing mode

Overview of Assembly Language




Assembly Language Statements

• Three different classes
  Instructions
    » Tell CPU what to do
    » Executable instructions with an op-code
  Directives (or pseudo-ops)
    » Provide information to assembler on various aspects of the assembly process
    » Non-executable
      – Do not generate machine language instructions
  Macros
    » A shorthand notation for a group of statements
    » A sophisticated text substitution mechanism with parameters



Assembly Language Statements (cont’d)

• Assembly language statement format:
        [label]   mnemonic   [operands]   [;comment]
  Typically one statement per line
  Fields in [ ] are optional
  label serves two distinct purposes:
    » To label an instruction
      – Can transfer program execution to the labeled instruction
    » To label an identifier or constant
  mnemonic identifies the operation (e.g., add, or)
  operands specify the data required by the operation
    » Executable instructions can have zero to three operands



Assembly Language Statements (cont’d)

  comments
    » Begin with a semicolon (;) and extend to the end of the line
  Examples
        repeat:  inc   result   ; increment result
        CR       EQU   0DH      ; carriage return character
• White space can be used to improve readability
        repeat:
                 inc   result



Data Allocation

• Variable declaration in a high-level language such as C
        char    response
        int     value
        float   total
        double  average_value
  specifies
    » Amount of storage required (1 byte, 2 bytes, …)
    » Label to identify the storage allocated (response, value, …)
    » Interpretation of the bits stored (signed, floating point, …)
      – Bit pattern 1000 1101 1011 1001 is interpreted as
            -29,255 as a signed number
            36,281 as an unsigned number
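The two readings of the same 16-bit pattern can be reproduced with a short Python sketch. The helper names are hypothetical, and two's complement is assumed for the signed interpretation.

```python
def as_unsigned16(bits):
    """Interpret the low 16 bits as an unsigned number."""
    return bits & 0xFFFF


def as_signed16(bits):
    """Interpret the low 16 bits as a two's complement signed number."""
    bits &= 0xFFFF
    return bits - 0x10000 if bits & 0x8000 else bits


pattern = 0b1000_1101_1011_1001
print(as_signed16(pattern), as_unsigned16(pattern))
```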



Data Allocation (cont’d)

• In assembly language, we use the define directive
  Define directive can be used
    » To reserve storage space
    » To label the storage space
    » To initialize
    » But no interpretation is attached to the bits stored
      – Interpretation is up to the program code
  Define directive goes into the .DATA part of the assembly language program
• Define directive format
        [var-name]   D?   init-value [,init-value],...




• Five define directives
        DB   Define Byte         ;allocates 1 byte
        DW   Define Word         ;allocates 2 bytes
        DD   Define Doubleword   ;allocates 4 bytes
        DQ   Define Quadword     ;allocates 8 bytes
        DT   Define Ten bytes    ;allocates 10 bytes
  Examples
        sorted     DB   'y'
        response   DB   ?         ;no initialization
        value      DW   25159
        float1     DD   1.234
        float2     DQ   123.456




• Multiple definitions can be abbreviated
  Example
        message   DB   'B'
                  DB   'y'
                  DB   'e'
                  DB   0DH
                  DB   0AH
  can be written as
        message   DB   'B','y','e',0DH,0AH
• More compactly as
        message   DB   'Bye',0DH,0AH




• Multiple definitions can be cumbersome to initialize data structures such as arrays
  Example
    To declare and initialize an integer array of 8 elements
        marks   DW   0,0,0,0,0,0,0,0
• What if we want to declare and initialize to zero an array of 200 elements?
  There is a better way of doing this than repeating zero 200 times in the above statement
    » Assembler provides a directive to do this (DUP directive)




• Multiple initializations
  The DUP assembler directive allows multiple initializations to the same value
  Previous marks array can be compactly declared as
        marks   DW   8 DUP (0)
  Examples
        table1    DW   10 DUP (?)       ;10 words, uninitialized
        message   DB   3 DUP ('Bye!')   ;12 bytes, initialized
                                        ;   as Bye!Bye!Bye!
        Name1     DB   30 DUP ('?')     ;30 bytes, each
                                        ;   initialized to ?




• The DUP directive may also be nested
  Example
        stars   DB   4 DUP (3 DUP ('*'),2 DUP ('?'),5 DUP ('!'))
  Reserves 40 bytes of space and initializes it as
        ***??!!!!!***??!!!!!***??!!!!!***??!!!!!
  Example
        matrix   DW   10 DUP (5 DUP (0))
  defines a 10 x 5 matrix and initializes its elements to 0
  This declaration can also be done by
        matrix   DW   50 DUP (0)
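Nested DUP expansion can be modeled as plain string repetition. The Python sketch below is only an analogy for byte (DB) data; `dup` is a hypothetical helper, not an assembler feature.

```python
def dup(count, *items):
    """Model `count DUP (item, item, ...)` for byte data: repeat the joined items."""
    return "".join(items) * count


stars = dup(4, dup(3, "*"), dup(2, "?"), dup(5, "!"))
print(len(stars), stars)
```

The printed length matches the 40 bytes reserved by the nested stars declaration.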




Symbol Table
  Assembler builds a symbol table so we can refer to the allocated storage space by the associated label
  Example
        .DATA                                  name      offset
        value     DW   0                       value     0
        sum       DD   0                       sum       2
        marks     DW   10 DUP (?)              marks     6
        message   DB   'The grade is:',0       message   26
        char1     DB   ?                       char1     40
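The offsets in the symbol table follow from adding up the sizes of the preceding definitions. A minimal Python sketch of that bookkeeping is shown here; `build_symbol_table` and the tuple format are assumptions for illustration, not how any particular assembler is implemented.

```python
SIZES = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8, "DT": 10}   # bytes per element


def build_symbol_table(declarations):
    """Map each label to its offset from the start of the data segment."""
    table = {}
    offset = 0
    for name, directive, count in declarations:
        table[name] = offset
        offset += SIZES[directive] * count
    return table


DECLARATIONS = [
    ("value", "DW", 1),
    ("sum", "DD", 1),
    ("marks", "DW", 10),
    ("message", "DB", len("The grade is:") + 1),   # string plus the 0 byte
    ("char1", "DB", 1),
]
symbols = build_symbol_table(DECLARATIONS)
print(symbols)
```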




Correspondence to C Data Types

        Directive   C data type
        DB          char
        DW          int, unsigned
        DD          float, long
        DQ          double
        DT          internal intermediate float value




LABEL Directive
  LABEL directive provides another way to name a memory location
  Format:
        name   LABEL   type
  type can be
        BYTE     1 byte
        WORD     2 bytes
        DWORD    4 bytes
        QWORD    8 bytes
        TWORD    10 bytes




LABEL Directive
  Example
        .DATA
        count      LABEL   WORD
        Lo_count   DB      0
        Hi_count   DB      0
        .CODE
        ...
        mov   Lo_count,AL
        mov   Hi_count,CL
  count refers to the 16-bit value
  Lo_count refers to the low byte
  Hi_count refers to the high byte



Where Are the Operands?

• Operands required by an operation can be specified in a variety of ways
• A few basic ways are:
  operand in a register
    – register addressing mode
  operand in the instruction itself
    – immediate addressing mode
  operand in memory
    – variety of addressing modes: direct and indirect addressing modes
  operand at an I/O port



Where Are the Operands? (cont’d)

Register addressing mode
  Operand is in an internal register
  Examples
        mov   EAX,EBX   ; 32-bit copy
        mov   BX,CX     ; 16-bit copy
        mov   AL,CL     ; 8-bit copy
  The mov instruction
        mov   destination,source
  copies data from source to destination




Register addressing mode (cont’d)
  Most efficient way of specifying an operand
    » No memory access is required
  Instructions using this mode tend to be shorter
    » Fewer bits are needed to specify the register
• Compilers use this mode to optimize code
        total := 0
        for (i = 1 to 400)
            total = total + marks[i]
        end for
  Mapping total and i to registers during the for loop optimizes the code




Immediate addressing mode
  Data is part of the instruction
    » Operand is located in the code segment along with the instruction
    » Efficient as no separate operand fetch is needed
    » Typically used to specify a constant
  Example
        mov   AL,75
  This instruction uses register addressing mode for destination and immediate addressing mode for the source




Direct addressing mode
  Data is in the data segment
    » Need a logical address to access data
      – Two components: segment:offset
    » Various addressing modes to specify the offset component
      – offset part is called effective address
  The offset is specified directly as part of instruction
  We write assembly language programs using memory labels (e.g., declared using DB, DW, LABEL,...)
    » Assembler computes the offset value for the label
      – Uses symbol table to compute the offset of a label




Direct addressing mode (cont’d)
  Examples
        mov   AL,response
    » Assembler replaces response by its effective address (i.e., its offset value from the symbol table)
        mov   table1,56
    » table1 is declared as
        table1   DW   20 DUP (0)
    » Since the assembler replaces table1 by its effective address, this instruction refers to the first element of table1
      – In C, it is equivalent to
            table1[0] = 56




Direct addressing mode (cont’d)
• Problem with direct addressing
  Useful only to specify simple variables
  Causes serious problems in addressing data types such as arrays
    » As an example, consider adding elements of an array
      – Direct addressing does not facilitate using a loop structure to iterate through the array
      – We have to write an instruction to add each element of the array
• Indirect addressing mode remedies this problem




Indirect addressing mode
• The offset is specified indirectly via a register
  Sometimes called register indirect addressing mode
  For 16-bit addressing, the offset value can be in one of the three registers: BX, SI, or DI
  For 32-bit addressing, all 32-bit registers can be used
  Example
        mov   AX,[BX]
  Square brackets [ ] are used to indicate that BX is holding an offset value
    » BX contains a pointer to the operand, not the operand itself




• Using indirect addressing mode, we can process arrays using loops
  Example: Summing array elements
  Load the starting address (i.e., offset) of the array into BX
  Loop for each element in the array
    » Get the value using the offset in BX
      – Use indirect addressing
    » Add the value to the running total
    » Update the offset in BX to point to the next element of the array
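The summing loop above can be mimicked in Python by treating the data segment as a byte string and BX as an offset into it. This is a loose model under stated assumptions: 16-bit little-endian words, a 16-bit running total that wraps around, and a hypothetical function name `sum_words`.

```python
import struct


def sum_words(memory, start, count):
    """Sum `count` 16-bit words beginning at offset `start`, pointer-style."""
    bx = start                                   # BX holds the current offset
    total = 0
    for _ in range(count):
        (value,) = struct.unpack_from("<H", memory, bx)   # like mov AX,[BX]
        total = (total + value) & 0xFFFF         # 16-bit running total
        bx += 2                                  # advance to the next word
    return total


memory = struct.pack("<4H", 10, 20, 30, 40)      # a 4-element word array
print(sum_words(memory, 0, 4))
```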




Loading offset value into a register
• Suppose we want to load BX with the offset value of table1
• We cannot write
        mov   BX,table1
• Two ways of loading offset value
    » Using OFFSET assembler directive
      – Executed only at the assembly time
    » Using lea instruction
      – This is a processor instruction
      – Executed at run time




Loading offset value into a register (cont’d)
• Using OFFSET assembler directive
  The previous example can be written as
        mov   BX,OFFSET table1
• Using lea (load effective address) instruction
  The format of lea instruction is
        lea   register,source
  The previous example can be written as
        lea   BX,table1




Loading offset value into a register (cont’d)
Which one to use -- OFFSET or lea?
  Use OFFSET if possible
    » OFFSET incurs only one-time overhead (at assembly time)
    » lea incurs run time overhead (every time you run the program)
  May have to use lea in some instances
    » When the needed data is available at run time only
      – An index passed as a parameter to a procedure
    » We can write
        lea   BX,table1[SI]
      to load BX with the address of an element of table1 whose index is in SI register
    » We cannot use the OFFSET directive in this case



Default Segments

• In register indirect addressing mode
  16-bit addresses
    » Effective address in BX, SI, or DI is taken as the offset into the data segment (relative to DS)
    » For BP and SP registers, the offset is taken to refer to the stack segment (relative to SS)
  32-bit addresses
    » Effective address in EAX, EBX, ECX, EDX, ESI, or EDI is relative to DS
    » Effective address in EBP or ESP is relative to SS
  push and pop are always relative to SS



Default Segments (cont’d)

• Default segment override
  Possible to override the defaults by using override prefixes
    » CS, DS, SS, ES, FS, GS
  Example 1
    » We can use
        add   AX,SS:[BX]
      to refer to a data item on the stack
  Example 2
    » We can use
        add   AX,DS:[BP]
      to refer to a data item in the data segment



Data Transfer Instructions

• We will look at three instructions
  mov (move)
    » Actually copy
  xchg (exchange)
    » Exchanges two operands
  xlat (translate)
    » Translates byte values using a translation table
• Other data transfer instructions such as
        movsx (move sign extended)
        movzx (move zero extended)



Data Transfer Instructions (cont’d)

The mov instruction
  The format is
        mov   destination,source
    » Copies the value from source to destination
    » source is not altered as a result of copying
    » Both operands should be of same size
    » source and destination cannot both be in memory
      – Most Pentium instructions do not allow both operands to be located in memory
      – Pentium provides special instructions to facilitate memory-to-memory block copying of data




The mov instruction
  Five types of operand combinations are allowed:

        Instruction type           Example
        mov register,register      mov DX,CX
        mov register,immediate     mov BL,100
        mov register,memory        mov BX,count
        mov memory,register        mov count,SI
        mov memory,immediate       mov count,23

  The operand combinations are valid for all instructions that require two operands




Ambiguous moves: PTR directive
• For the following data definitions
        .DATA
        table1   DW   20 DUP (0)
        status   DB   7 DUP (1)
  the last two mov instructions are ambiguous
        mov   BX,OFFSET table1
        mov   SI,OFFSET status
        mov   [BX],100
        mov   [SI],100
  Not clear whether the assembler should use byte or word equivalent of 100




Ambiguous moves: PTR directive
• The PTR assembler directive can be used to clarify
• The last two mov instructions can be written as
        mov   WORD PTR [BX],100
        mov   BYTE PTR [SI],100
  WORD and BYTE are called type specifiers
• We can also use the following type specifiers:
        DWORD for doubleword values
        QWORD for quadword values
        TWORD for ten byte values




The xchg instruction
• The syntax is
        xchg   operand1,operand2
  Exchanges the values of operand1 and operand2
  Examples
        xchg   EAX,EDX
        xchg   response,CL
        xchg   total,DX
• Without the xchg instruction, we need a temporary register to exchange values using only the mov instruction




The xchg instruction
• The xchg instruction is useful for conversion of 16-bit data between little endian and big endian forms
  Example:
        xchg   AL,AH
  converts the data in AX into the other endian form
• Pentium provides bswap instruction to do similar conversion on 32-bit data
        bswap   32-bit register
  bswap works only on data located in a 32-bit register
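The effect of the two byte-swapping instructions can be modeled in Python. The functions below are hypothetical stand-ins that operate on plain integers rather than registers.

```python
def xchg_al_ah(ax):
    """Model xchg AL,AH: swap the two bytes of a 16-bit value."""
    ax &= 0xFFFF
    return ((ax & 0xFF) << 8) | (ax >> 8)


def bswap32(eax):
    """Model bswap on a 32-bit register: reverse the order of its four bytes."""
    return int.from_bytes((eax & 0xFFFFFFFF).to_bytes(4, "little"), "big")


print(hex(xchg_al_ah(0x1234)), hex(bswap32(0x12345678)))
```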




The xlat instruction
• The xlat instruction translates bytes
• The format is
        xlatb
• To use xlat instruction
    » BX should be loaded with the starting address of the translation table
    » AL must contain an index into the table
      – Index value starts at zero
    » The instruction reads the byte at this index in the translation table and stores this value in AL
      – The index value in AL is lost
    » Translation table can have at most 256 entries (due to AL)




The xlat instruction
Example: Encrypting digits
        Input digits:      0 1 2 3 4 5 6 7 8 9
        Encrypted digits:  4 6 9 5 0 3 1 8 7 2

        .DATA
        xlat_table   DB   '4695031872'
        ...
        .CODE
        mov     BX,OFFSET xlat_table
        GetCh   AL
        sub     AL,'0'   ; converts input character to index
        xlatb            ; AL = encrypted digit character
        PutCh   AL
        ...
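The table lookup performed by xlatb in the digit-encryption example can be imitated in Python. This sketch assumes the same translation table; `xlat` and `encrypt_digit` are hypothetical helper names, and the GetCh/PutCh I/O macros are replaced by ordinary function arguments and return values.

```python
XLAT_TABLE = b"4695031872"       # same table as xlat_table in the example


def xlat(table, al):
    """Model xlatb: AL is replaced by the table byte at index AL."""
    return table[al & 0xFF]


def encrypt_digit(ch):
    al = ord(ch) - ord("0")      # sub AL,'0' converts the character to an index
    return chr(xlat(XLAT_TABLE, al))


print("".join(encrypt_digit(c) for c in "2003"))
```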



Pentium Assembly Instructions

• Pentium provides several types of instructions
• Brief overview of some basic instructions:
  Arithmetic instructions
  Jump instructions
  Loop instruction
  Logical instructions
  Shift instructions
  Rotate instructions
• These instructions allow you to write reasonable assembly language programs



Arithmetic Instructions

INC and DEC instructions
  Format:
        inc   destination
        dec   destination
  Semantics:
        destination = destination +/- 1
    » destination can be 8-, 16-, or 32-bit operand, in memory or register
      – No immediate operand
• Examples
        inc   BX
        dec   value



Arithmetic Instructions (cont’d)

Add instructions
  Format:
        add   destination,source
  Semantics:
        destination = destination + source
• Examples
        add   EBX,EAX
        add   value,35
  inc EAX is better than add EAX,1
      – inc takes less space
      – Both execute at about the same speed




Add instructions
  Addition with carry
  Format:
        adc   destination,source
  Semantics:
        destination = destination + source + CF
• Example: 64-bit addition
        add   EAX,ECX   ; add lower 32 bits
        adc   EBX,EDX   ; add upper 32 bits with carry
  64-bit result in EBX:EAX
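The add/adc pair can be traced with a small Python model of 32-bit registers and the carry flag. The function names are hypothetical, and only CF is modeled (the other flags are ignored).

```python
MASK32 = 0xFFFFFFFF


def add32(a, b, carry_in=0):
    """32-bit add returning (result, carry-out), like add/adc setting CF."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def add64(ebx, eax, edx, ecx):
    """Compute EBX:EAX + EDX:ECX the way the add/adc pair does."""
    eax, cf = add32(eax, ecx)          # add EAX,ECX  (lower 32 bits)
    ebx, cf = add32(ebx, edx, cf)      # adc EBX,EDX  (upper 32 bits + CF)
    return ebx, eax, cf


print([hex(v) for v in add64(0x00000001, 0xFFFFFFFF, 0x00000002, 0x00000001)])
```

In this run the lower addition overflows, so the carry propagates into the upper half.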




Subtract instructions
  Format:
        sub   destination,source
  Semantics:
        destination = destination - source
• Examples
        sub   EBX,EAX
        sub   value,35
  dec EAX is better than sub EAX,1
      – dec takes less space
      – Both execute at about the same speed




Subtract instructions
  Subtract with borrow
  Format:
        sbb   destination,source
  Semantics:
        destination = destination - source - CF
  Like the adc, sbb is useful in dealing with more than 32-bit numbers
• Negation
        neg   destination
  Semantics:
        destination = 0 - destination




CMP instruction
  Format:
        cmp   destination,source
  Semantics:
        destination - source
  destination and source are not altered
  Useful to test relationship (>, <, =) between two operands
  Used in conjunction with conditional jump instructions for decision making purposes
• Examples
        cmp   EBX,EAX
        cmp   count,100



Unconditional Jump

Format:
        jmp   label
Semantics:
    » Execution is transferred to the instruction identified by label
• Target can be specified in one of two ways
  Directly
    » In the instruction itself
  Indirectly
    » Through a register or memory



Unconditional Jump (cont’d)

Example
• Two jump instructions
  Forward jump
        jmp   CX_init_done
  Backward jump
        jmp   repeat1
• Programmer specifies target by a label
• Assembler computes the offset using the symbol table

              . . .
              mov   CX,10
              jmp   CX_init_done
        init_CX_20:
              mov   CX,20
        CX_init_done:
              mov   AX,CX
        repeat1:
              dec   CX
              . . .
              jmp   repeat1
              . . .



Unconditional Jump (cont’d)

• Address specified in the jump instruction is not the absolute address
  Uses relative address
    » Specifies relative byte displacement between the target instruction and the instruction following the jump instruction
    » Displacement is w.r.t. the instruction following jmp
      – Reason: IP points to this instruction after reading jump
  Execution of jmp involves adding the displacement value to current IP
  Displacement is a signed 16-bit number
    » Negative value for backward jumps
    » Positive value for forward jumps



Target Location

• Inter-segment jump
  Target is in another segment
        CS = target-segment (2 bytes)
        IP = target-offset (2 bytes)
    » Called far jumps (needs five bytes to encode jmp)
• Intra-segment jumps
  Target is in the same segment
        IP = IP + relative-displacement (1 or 2 bytes)
  Uses 1-byte displacement if target is within -128 to +127
    » Called short jumps (needs two bytes to encode jmp)
  If target is outside this range, uses 2-byte displacement
    » Called near jumps (needs three bytes to encode jmp)



Target Location (cont’d)

• In most cases, the assembler can figure out the type of jump
  For backward jumps, assembler can decide whether to use the short jump form or not
• For forward jumps, it needs a hint from the programmer
  Use SHORT prefix to the target label
  If such a hint is not given
    » Assembler reserves three bytes for jmp instruction
    » If short jump can be used, leaves one byte of nop (no operation)
      – See the next example for details



Example

              . . .
  0005  EB 0C            jmp   SHORT CX_init_done
                               ; 0013 - 0007 = 0C
  0007  B9 000A          mov   CX,10
  000A  EB 07 90         jmp   CX_init_done
                               ; third byte (90) is a nop; 0013 - 000C = 07
                   init_CX_20:
  000D  B9 0014          mov   CX,20
  0010  E9 00D0          jmp   near_jump
                               ; 00E3 - 0013 = D0
                   CX_init_done:
  0013  8B C1            mov   AX,CX



Example (cont’d)

                   repeat1:
  0015  49               dec   CX
  0016  EB FD            jmp   repeat1
                               ; 0015 - 0018 = -3 = FDH
              . . .
  00DB  EB 03            jmp   SHORT short_jump
                               ; 00E0 - 00DD = 3
  00DD  B9 FF00          mov   CX,0FF00H
                   short_jump:
  00E0  BA 0020          mov   DX,20H
                   near_jump:
  00E3  E9 FF27          jmp   init_CX_20
                               ; 000D - 00E6 = -217 = FF27H
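The displacement values in the listing can be recomputed with a short Python sketch. It assumes, as the slides state, that the displacement is measured from the instruction following the jump and stored in two's complement; `jump_displacement` is a hypothetical helper.

```python
def jump_displacement(jmp_address, jmp_length, target, disp_bytes):
    """Encoded relative displacement for a short (1-byte) or near (2-byte) jump."""
    next_ip = jmp_address + jmp_length        # IP after the jump is fetched
    disp = target - next_ip
    bits = 8 * disp_bytes
    if not -(1 << (bits - 1)) <= disp < (1 << (bits - 1)):
        raise ValueError("target out of range for this jump form")
    return disp & ((1 << bits) - 1)           # two's complement encoding


print(hex(jump_displacement(0x0016, 2, 0x0015, 1)))   # backward short jump
print(hex(jump_displacement(0x00E3, 3, 0x000D, 2)))   # backward near jump
```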



Conditional Jumps (cont’d)

Format:
        j<cond>   label
      – Execution is transferred to the instruction identified by label only if <cond> is met
• Example: Testing for carriage return
        read_char:
              . . .
              cmp   AL,0DH   ; 0DH = ASCII carriage return
              je    CR_received
              inc   CL
              jmp   read_char
              . . .
        CR_received:




Some conditional jump instructions
      – Treats operands of the CMP instruction as signed numbers
        je    jump if equal
        jg    jump if greater
        jl    jump if less
        jge   jump if greater or equal
        jle   jump if less or equal
        jne   jump if not equal




Conditional jump instructions can also test values of the individual flags
        jz    jump if zero (i.e., if ZF = 1)
        jnz   jump if not zero (i.e., if ZF = 0)
        jc    jump if carry (i.e., if CF = 1)
        jnc   jump if not carry (i.e., if CF = 0)
  jz is synonymous with je
  jnz is synonymous with jne



A Note on Conditional Jumps

• All conditional jumps are encoded using 2 bytes
  Treated as short jumps
• What if the target is outside this range?
        target:
              . . .
              cmp   AX,BX
              je    target
              mov   CX,10
              . . .
  target is out of range for a short jump
• Use this code to get around
        target:
              . . .
              cmp   AX,BX
              jne   skip1
              jmp   target
        skip1:
              mov   CX,10
              . . .



Loop Instructions

Unconditional loop instruction
  Format:
        loop   target
  Semantics:
    » Decrements CX and jumps to target if CX ≠ 0
      – CX should be loaded with a loop count value
• Example: Executes loop body 50 times
              mov   CX,50
        repeat:
              <loop body>
              loop  repeat
              ...



Loop Instructions (cont’d)

• The previous example is equivalent to
              mov   CX,50
        repeat:
              <loop body>
              dec   CX
              jnz   repeat
              ...
  Surprisingly,
              dec   CX
              jnz   repeat
  executes faster than
              loop  repeat



Loop Instructions (cont’d)

• Conditional loop instructions
  loope/loopz
    » Loop while equal/zero
        CX = CX – 1
        if (CX ≠ 0 and ZF = 1)
            jump to target
  loopne/loopnz
    » Loop while not equal/not zero
        CX = CX – 1
        if (CX ≠ 0 and ZF = 0)
            jump to target



Logical Instructions

Format:
        and   destination,source
        or    destination,source
        xor   destination,source
        not   destination
Semantics:
    » Performs the standard bitwise logical operations
      – result goes to destination
test is a non-destructive and instruction
        test   destination,source
Performs logical AND but the result is not stored in destination (like the CMP instruction)



Logical Instructions (cont’d)

Example: . . .

and AL,01H ; test the least significant bit

jz bit_is_zero

<bit 1 code>

jmp skip1

bit_is_zero:

<bit 0 code>

skip1:

. . .

• The test instruction is a better choice than and here, because test leaves AL unchanged



Shift Instructions

• Two types of shifts
    » Logical
    » Arithmetic
Logical shift instructions
  Shift left
        shl   destination,count
        shl   destination,CL
  Shift right
        shr   destination,count
        shr   destination,CL
  Semantics:
    » Performs left/right shift of destination by the value in count or CL register
      – CL register contents are not altered



Shift Instructions (cont’d)

Logical shift
  Bit shifted out goes into the carry flag
    » Zero bit is shifted in at the other end




count is an immediate value
        shl   AX,5
Specification of count greater than 31 is not allowed
    » If a greater value is specified, only the least significant 5 bits are used
CL version is useful if shift count is known at run time
    » Ex: when the shift count value is passed as a parameter in a procedure call
    » Only the CL register can be used
Shift count value should be loaded into CL
        mov   CL,5
        shl   AX,CL




Arithmetic shift
  Two versions as in logical shift
        sal/sar   destination,count
        sal/sar   destination,CL



Double Shift Instructions

• Double shift instructions work on either 32- or 64-bit operands
• Format
  Takes three operands
        shld   dest,src,count   ; left shift
        shrd   dest,src,count   ; right shift
  dest can be in memory or register
  src must be a register
  count can be an immediate value or in CL as in other shift instructions



Double Shift Instructions (cont’d)

src is not modified by the double shift instruction
Only dest is modified
Shifted out bit goes into the carry flag



Rotate Instructions

Two types of ROTATE instructions

Rotate without carry
» rol (ROtate Left)
» ror (ROtate Right)

Rotate with carry
» rcl (Rotate through Carry Left)
» rcr (Rotate through Carry Right)

Format of ROTATE instructions is similar to the SHIFT instructions
» Supports two versions
– Immediate count value
– Count value in CL register



Rotate Instructions (cont’d)

Bit shifted out goes into the carry flag as in SHIFT instructions




Rotate Instructions (cont’d)

• Example: Shifting 64-bit numbers
Multiplies a 64-bit value in EDX:EAX by 16

» Rotate version
        mov  CX,4
shift_left:
        shl  EAX,1
        rcl  EDX,1
        loop shift_left

» Double shift version
        shld EDX,EAX,4
        shl  EAX,4

• Division can be done in a similar way
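To see that the two versions compute the same thing, here is a small Python model of both sequences. It is a sketch only: registers are modelled as plain 32-bit integers, and the function names are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def rotate_version(edx, eax):
    """Four iterations of: shl EAX,1 / rcl EDX,1."""
    for _ in range(4):
        cf = (eax >> 31) & 1                 # shl EAX,1 moves the MSB into CF
        eax = (eax << 1) & MASK32
        edx = ((edx << 1) | cf) & MASK32     # rcl EDX,1 brings CF into bit 0
    return edx, eax


def double_shift_version(edx, eax):
    """shld EDX,EAX,4 followed by shl EAX,4."""
    edx = ((edx << 4) | (eax >> 28)) & MASK32   # top 4 bits of EAX enter EDX
    eax = (eax << 4) & MASK32
    return edx, eax


edx, eax = 0x01234567, 0x89ABCDEF
print(tuple(hex(r) for r in rotate_version(edx, eax)))
print(tuple(hex(r) for r in double_shift_version(edx, eax)))
```

Both calls return the same register pair, which is the original 64-bit value multiplied by 16 (truncated to 64 bits).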




Defining Constants

• Assembler provides two directives:
» EQU directive
– No reassignment
– String constants can be defined
» = directive
– Can be reassigned
– No string constants

• Defining constants has two advantages:
Improves program readability
Helps in software maintenance
» Multiple occurrences can be changed from a single place

• Convention
» We use all upper-case letters for names of constants



Defining Constants (cont’d)

The EQU directive
• Syntax:
name EQU expression
Assigns the result of expression to name
The expression is evaluated at assembly time
Similar to #define in C

Examples
NUM_OF_ROWS EQU 50
NUM_OF_COLS EQU 10
ARRAY_SIZE EQU NUM_OF_ROWS * NUM_OF_COLS

Can also be used to define string constants
JUMP EQU jmp



Defining Constants (cont’d)

The = directive
• Syntax:
name = expression
Similar to EQU directive
Two key differences:
» Redefinition is allowed
count = 0
. . .
count = 99
is valid
» Cannot be used to define string constants or to redefine keywords or instruction mnemonics
Example: JUMP = jmp is not allowed



Macros

• Macros can be defined with MACRO and ENDM
• Format
macro_name MACRO [parameter1, parameter2,...]
    macro body
ENDM

• A macro can be invoked using
macro_name [argument1, argument2, …]

Example:
Definition
multAX_by_16 MACRO
    sal AX,4
ENDM

Invocation
    ...
    mov AX,27
    multAX_by_16
    ...



Macros (cont’d)

• Macros can be defined with parameters
» More flexible
» More useful

• Example
mult_by_16 MACRO operand
    sal operand,4
ENDM

To multiply a byte in DL register
mult_by_16 DL

To multiply a memory variable count
mult_by_16 count



Macros (cont’d)

Example: To exchange two memory words
Memory-to-memory transfer

Wmxchg MACRO operand1, operand2
    xchg AX,operand1
    xchg AX,operand2
    xchg AX,operand1
ENDM



Illustrative Examples

• Five examples in this chapter
Conversion of ASCII to binary representation (BINCHAR.ASM)
Conversion of ASCII to hexadecimal by character manipulation (HEX1CHAR.ASM)
Conversion of ASCII to hexadecimal using the XLAT instruction (HEX2CHAR.ASM)
Conversion of lowercase letters to uppercase by character manipulation (TOUPPER.ASM)
Sum of individual digits of a number (ADDIGITS.ASM)

Procedures and the Stack




What is a Stack?

• Stack is a last-in-first-out (LIFO) data structure
If we view the stack as a linear array of elements, both insertion and deletion operations are restricted to one end of the array
Only the element at the top-of-stack (TOS) is directly accessible

• Two basic stack operations
push
» Insertion
pop
» Deletion



What is a Stack? (cont’d)

• Example Insertion of data items into the stack

» Arrow points to the top-of-stack




What is a Stack? (cont’d)

• Example Deletion of data items from the stack

» Arrow points to the top-of-stack




Pentium Implementation of the Stack

• Stack segment is used to implement the stack
Registers SS and (E)SP are used
SS:(E)SP represents the top-of-stack

• Pentium stack implementation characteristics are
Only words (i.e., 16-bit data) or doublewords (i.e., 32-bit data) are saved on the stack, never a single byte
Stack grows toward lower memory addresses
» Stack grows “downward”
Top-of-stack (TOS) always points to the last data item placed on the stack



Pentium Stack Example - 1




Pentium Stack Example - 2




Pentium Stack Instructions

• Pentium provides two basic instructions:
push source
pop destination

source and destination can be
» a 16- or 32-bit general register
» a segment register
» a word or doubleword in memory

source of push can also be an immediate operand of size 8, 16, or 32 bits



Pentium Stack Instructions: Examples

• On an empty stack created by
.STACK 100H
the following sequence of push instructions
push 21ABH
push 7FBD329AH
results in the stack state shown in (a) in the last figure

• On this stack, executing
pop EBX
results in the stack state shown in (b) in the last figure, and the register EBX gets the value 7FBD329AH
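The push/pop sequence above can be traced with a small Python model of a downward-growing stack. This is a sketch, not the processor's actual mechanism: the class name Stack16 is hypothetical, and it assumes SP starts at 100H for the empty 100H-byte stack and that data is stored little-endian.

```python
class Stack16:
    """Toy model of a stack segment that grows toward lower addresses."""

    def __init__(self, size=0x100):
        self.mem = bytearray(size)
        self.sp = size                     # empty stack: SP is just past the top

    def push(self, value, nbytes):
        self.sp -= nbytes                  # SP is decremented first
        self.mem[self.sp:self.sp + nbytes] = value.to_bytes(nbytes, "little")

    def pop(self, nbytes):
        value = int.from_bytes(self.mem[self.sp:self.sp + nbytes], "little")
        self.sp += nbytes                  # SP is incremented after the read
        return value


s = Stack16()
s.push(0x21AB, 2)          # push 21ABH      (word)
s.push(0x7FBD329A, 4)      # push 7FBD329AH  (doubleword)
ebx = s.pop(4)             # pop EBX
print(hex(ebx), hex(s.sp))
```

After the two pushes SP has moved down by 6 bytes; the pop of a doubleword moves it back up by 4, leaving only the word 21ABH on the stack.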




Additional Pentium Stack Instructions

Stack Operations on Flags
• push and pop instructions cannot be used on the Flags register
• Two special instructions for this purpose are
pushf (push 16-bit flags)
popf (pop 16-bit flags)
• No operands are required
• Use pushfd and popfd for 32-bit flags (EFLAGS)



Additional Pentium Stack Instructions (cont’d)

Stack Operations on 8 General-Purpose Registers

• pusha and popa instructions can be used to save and restore the eight general-purpose registers

AX, CX, DX, BX, SP, BP, SI, and DI

• pusha pushes these eight registers in the above order (AX first and DI last)

• popa restores these registers except that SP value is not loaded into the SP register

• Use pushad and popad for saving and restoring 32-bit registers




Uses of the Stack

• Three main uses
» Temporary storage of data
» Transfer of control
» Parameter passing

Temporary Storage of Data
Example: Exchanging value1 and value2 can be done by using the stack to temporarily hold data
push value1
push value2
pop value1
pop value2



Uses of the Stack (cont’d)

• Often used to free a set of registers

;save EBX & ECX registers on the stack
push EBX
push ECX
. . . . . .
<<EBX and ECX can now be used>>
. . . . . .
;restore EBX & ECX from the stack
pop ECX
pop EBX



Uses of the Stack (cont’d)

Transfer of Control
• In procedure calls and interrupts, the return address is stored on the stack
Our discussion on procedure calls clarifies this particular use of the stack

Parameter Passing
• Stack is extensively used for parameter passing
Our discussion later on parameter passing describes how the stack is used for this purpose



Assembler Directives for Procedures

• Assembler provides two directives to define procedures: PROC and ENDP

• To define a NEAR procedure, use
proc-name PROC NEAR
In a NEAR procedure, both calling and called procedures are in the same code segment

• A FAR procedure can be defined by
proc-name PROC FAR
Called and calling procedures are in two different segments in a FAR procedure



Assembler Directives for Procedures (cont’d)

• If FAR or NEAR is not specified, NEAR is assumed (i.e., NEAR is the default)
• We focus on NEAR procedures
• A typical NEAR procedure definition
proc-name PROC
    . . . . .
    <procedure body>
    . . . . .
proc-name ENDP

proc-name should match in PROC and ENDP



Pentium Instructions for Procedures

• Pentium provides two instructions: call and ret
• call instruction is used to invoke a procedure
• The format is
call proc-name
proc-name is the procedure name
• Actions taken during a near procedure call
SP = SP - 2
(SS:SP) = IP
IP = IP + relative displacement



Pentium Instructions for Procedures (cont’d)

• ret instruction is used to transfer control back to the calling procedure
• How will the processor know where to return?
Uses the return address pushed onto the stack as part of executing the call instruction
Important that TOS points to this return address when ret instruction is executed
• Actions taken during the execution of ret are:
IP = (SS:SP)
SP = SP + 2



Pentium Instructions for Procedures (cont’d)

• We can specify an optional integer in the ret instruction
The format is
ret optional-integer
Example:
ret 6
• Actions taken on ret with optional-integer are:
IP = (SS:SP)
SP = SP + 2 + optional-integer



How Is Program Control Transferred?

Offset(hex)  machine code(hex)
                             main PROC
                                 . . . . . .
cs:000A      E8000C              call sum
cs:000D      8BD8                mov  BX,AX
                                 . . . . . .
                             main ENDP

                             sum  PROC
cs:0019      55                  push BP
                                 . . . . . .
                             sum  ENDP

                             avg  PROC
                                 . . . . . .
cs:0028      E8FFEE              call sum
cs:002B      8BD0                mov  DX,AX
                                 . . . . . .
                             avg  ENDP
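The displacement in each call instruction is relative to the address of the instruction that follows the call. A short Python sketch of that arithmetic is given here; the function name call_target is hypothetical, and the displacement is read in the order printed in the listing (000C and FFEE), treated as a signed 16-bit value.

```python
def call_target(call_offset, machine_code_hex):
    """Compute the target offset of a 3-byte near call (opcode E8 + 16-bit displacement)."""
    disp = int(machine_code_hex[2:], 16)     # displacement as printed in the listing
    if disp >= 0x8000:                       # interpret as a signed 16-bit number
        disp -= 0x10000
    next_ip = call_offset + 3                # IP already points past the call
    return (next_ip + disp) & 0xFFFF


print(hex(call_target(0x000A, "E8000C")))    # call sum from main
print(hex(call_target(0x0028, "E8FFEE")))    # call sum from avg
```

Both calls resolve to offset 0019H, the first instruction of sum: the forward call uses a positive displacement and the backward call a negative one.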




Parameter Passing

• Parameter passing is different from, and more complicated than, that in a high-level language
• In assembly language
» First place all required parameters in a mutually accessible storage area
» Then call the procedure
• Type of storage area used
» Registers (general-purpose registers are used)
» Memory (stack is used)
• Two common methods of parameter passing
» Register method
» Stack method



Parameter Passing: Register Method

• Calling procedure places the necessary parameters in the general-purpose registers before invoking the procedure through the call instruction
• Examples:
PROCEX1.ASM
» call-by-value using the register method
» a simple sum procedure
PROCEX2.ASM
» call-by-reference using the register method
» string length procedure



Pros and Cons of the Register Method

• Advantages
Convenient and easier
Faster

• Disadvantages
Only a few parameters can be passed using the register method
– Only a small number of registers are available
Often these registers are not free
– Freeing them by pushing their values onto the stack negates the second advantage



Parameter Passing: Stack Method

• All parameter values are pushed onto the stack before calling the procedure
• Example:
push number1
push number2
call sum



Accessing Parameters on the Stack

• Parameter values are buried inside the stack
• We cannot use
mov BX,[SP+2] ;illegal
to access number2 in the previous example
• We can use
mov BX,[ESP+2] ;valid
Problem: The ESP value changes with push and pop operations
» Relative offset depends on the stack operations performed
» Not desirable



Accessing Parameters on the Stack (cont’d)

• We can also use
add SP,2
mov BX,[SP] ;valid
Problem: cumbersome
» We have to remember to update SP to point to the return address on the stack before the end of the procedure
• Is there a better alternative?
Use the BP register to access parameters on the stack



Using BP Register to Access Parameters

• Preferred method of accessing parameters on the stack is
mov BP,SP
mov BX,[BP+2]
to access number2 in the previous example
• Problem: BP contents are lost!
We have to preserve the contents of BP
Use the stack (caution: offset value changes)
push BP
mov BP,SP
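The caution about the changing offset can be made concrete with a small Python sketch that lays out the stack for the earlier example (push number1, push number2, call sum) and reports each item's offset from BP. The function name and the starting SP value are assumptions; every item is taken to be one 16-bit word.

```python
def frame_offsets(save_bp, initial_sp=0x100):
    """Return each stack item's offset relative to BP after `mov BP,SP`."""
    sp = initial_sp
    layout = {}
    # push number1, push number2, then call pushes the return address
    for name in ("number1", "number2", "return address"):
        sp -= 2
        layout[name] = sp
    if save_bp:                    # push BP before mov BP,SP
        sp -= 2
        layout["old BP"] = sp
    bp = sp                        # mov BP,SP
    return {name: addr - bp for name, addr in layout.items()}


print(frame_offsets(save_bp=False))
print(frame_offsets(save_bp=True))
```

Without saving BP, number2 sits at [BP+2]; once push BP is added, every parameter moves two bytes further away, so number2 is reached as [BP+4] and number1 as [BP+6].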




Clearing the Stack Parameters

Stack state after push BP

Stack state after pop BP

Stack state after executing ret



Clearing the Stack Parameters (cont’d)

• Two ways of clearing the unwanted parameters on the stack:
Use the optional-integer in the ret instruction
» Use
ret 4
in the previous example
Add the constant to SP in calling procedure (C uses this method)
push number1
push number2
call sum
add SP,4



Housekeeping Issues

• Who should clean up the stack of unwanted parameters?
Calling procedure
» Need to update SP with every procedure call
» Not really needed if procedures use fixed number of parameters
» C uses this method because C allows variable number of parameters
Called procedure
» Code becomes modular (parameter clearing is done in only one place)
» Cannot be used with variable number of parameters



Housekeeping Issues (cont’d)

• Need to preserve the state across a procedure call
» Stack is used for this purpose
• Which registers should be saved?
Save those registers that are used by the calling procedure but are modified by the called procedure
» Might cause problems
Save all registers (brute force method)
» Done by using pusha
» Increased overhead
– pusha takes 5 clocks, as opposed to 1 clock to save a single register




Stack state after pusha





• Who should preserve the state of the calling procedure?
Calling procedure
» Need to know the registers used by the called procedure
» Need to include instructions to save and restore registers with every procedure call
» Causes program maintenance problems
Called procedure
» Preferred method as the code becomes modular (state preservation is done only once and in one place)
» Avoids the program maintenance problems mentioned



Stack Frame Instructions

• ENTER instruction
Facilitates stack frame (discussed later) allocation
enter bytes,level
bytes = local storage space
level = nesting level (we use 0)
Example
enter XX,0
Equivalent to
push BP
mov BP,SP
sub SP,XX



Stack Frame Instructions (cont’d)

• LEAVE instruction
Releases stack frame
leave
» Takes no operands
» Equivalent to
mov SP,BP
pop BP



A Typical Procedure Template

proc-name PROC

enter XX,0

. . . . . .

<procedure body>

. . . . . .

leave

ret YY

proc-name ENDP




Stack Parameter Passing: Examples

• PROCEX3.ASM
call-by-value using the stack method
a simple sum procedure
• PROCSWAP.ASM
call-by-reference using the stack method
first two characters of the input string are swapped
• BBLSORT.ASM
implements bubble sort algorithm
uses pusha and popa to save and restore registers



Variable Number of Parameters

• For most procedures, the number of parameters is fixed
Same number of arguments in each call
• In procedures that can have variable number of parameters
Number of arguments can vary from call to call
C supports procedures with variable number of parameters
• Easy to support variable number of parameters using the stack method



Variable Number of Parameters (cont’d)

• To implement variable number of parameter passing:
Parameter count should be one of the parameters passed
This count should be the last parameter pushed onto the stack
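Pushing the count last puts it at a fixed position next to the return address, so the called procedure can find it first and then locate the remaining parameters. A minimal Python sketch of this convention follows; the stack is modelled as a list whose end is the top-of-stack, and both function names are hypothetical.

```python
def call_with_varargs(stack, *args):
    """Caller side: push the arguments, then the count, then 'call'."""
    for arg in args:
        stack.append(arg)
    stack.append(len(args))            # parameter count is pushed last
    stack.append("return address")     # pushed by the call instruction


def sum_varargs(stack):
    """Callee side: the count is always just below the return address."""
    count = stack[-2]
    params = stack[-2 - count:-2]      # the count tells how far down to look
    return sum(params)


stack = []
call_with_varargs(stack, 10, 20, 30)
print(sum_varargs(stack))
```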




Local Variables

• Local variables are dynamic in nature
Come into existence when the procedure is invoked
Disappear when the procedure terminates
• Cannot reserve space for these variables in the data segment for two reasons:
» Such space allocation is static
– Remains allocated even when the procedure is not active
» It does not work with recursive procedures
• Space for local variables is reserved on the stack



Local Variables (cont’d)

Example

• N and temp
Two local variables
Each requires two bytes of storage



Local Variables (cont’d)

• The information stored in the stack
» parameters
» return address
» old BP value
» local variables
is collectively called the stack frame
• In high-level languages, stack frame is also referred to as the activation record
» Each procedure activation requires all this information
• The BP value is referred to as the frame pointer
» Once the BP value is known, we can access all the data in the stack frame



Local Variables: Examples

• PROCFIB1.ASM
For simple procedures, registers can also be used for local variable storage
Uses registers for local variable storage
Outputs the largest Fibonacci number that is less than the given input number
• PROCFIB2.ASM
Uses the stack for local variable storage
Performance implications of using registers versus stack are discussed later



Multiple Module Programs

• In multi-module programs, a single program is split into multiple source files

• Advantages
» If a module is modified, only that module needs to be reassembled (not the whole program)
» Several programmers can share the work
» Making modifications is easier with several short files
» Unintended modifications can be avoided
• To facilitate separate assembly, two assembler directives are provided:
» PUBLIC and EXTRN



PUBLIC Assembler Directive

• The PUBLIC directive makes the associated labels public

» Makes these labels available for other modules of the program

• The format is
PUBLIC label1, label2, . . .
• Almost any label can be made public including
» procedure names
» variable names
» equated labels
• In the PUBLIC statement, it is not necessary to specify the type of label



Example: PUBLIC Assembler Directive

    . . . . .
    PUBLIC error_msg, total, sample
    . . . . .
.DATA
error_msg DB "Out of range!",0
total     DW 0
    . . . . .
.CODE
    . . . . .
sample PROC
    . . . . .
sample ENDP
    . . . . .



EXTRN Assembler Directive

• The EXTRN directive tells the assembler that certain labels are not defined in the current module
The assembler leaves “holes” in the OBJ file for the linker to fill in later on
• The format is
EXTRN label:type
where label is a label made public by a PUBLIC directive in some other module and type is the type of the label



EXTRN Assembler Directive (cont’d)

Type      Description
UNKNOWN   Undetermined or unknown type
BYTE      Data variable (size is 8 bits)
WORD      Data variable (size is 16 bits)
DWORD     Data variable (size is 32 bits)
QWORD     Data variable (size is 64 bits)
FWORD     Data variable (size is 6 bytes)
TBYTE     Data variable (size is 10 bytes)
PROC      A procedure name (NEAR or FAR according to .MODEL)
NEAR      A near procedure name
FAR       A far procedure name



EXTRN Assembler Directive (cont’d)

Example
.MODEL SMALL
. . . .
EXTRN error_msg:BYTE, total:WORD
EXTRN sample:PROC
. . . .
Note: EXTRN (not EXTERN)

Example
module1.asm (main procedure)
module2.asm (string-length procedure)

Addressing Modes




Addressing Modes

• Addressing mode refers to the specification of the location of data required by an operation

• Pentium supports three fundamental addressing modes:
Register mode
Immediate mode
Memory mode
• Specification of operands located in memory can be done in a variety of ways
Mainly to support high-level language constructs and data structures



Pentium Addressing Modes (32-bit Addresses)




Memory Addressing Modes (16-bit Addresses)




Simple Addressing Modes

• Register addressing mode
Operands are located in registers
It is the most efficient addressing mode
• Immediate addressing mode
Operand is stored as part of the instruction
» This mode is used mostly for constants
It imposes several restrictions
Efficient as the data comes with the instructions
» Instructions are generally prefetched
• Both addressing modes were discussed earlier



Memory Addressing Modes

• Pentium offers several addressing modes to access operands located in memory
» Primary reason: To efficiently support high-level language constructs and data structures
• Available addressing modes depend on the address size used
16-bit modes (shown before)
» same as those supported by 8086
32-bit modes (shown before)
» supported by Pentium
» more flexible set



32-Bit Addressing Modes

• These addressing modes use 32-bit registers

Segment + Base + (Index * Scale) + displacement

Segment  Base  Index  Scale  Displacement
CS       EAX   EAX    1      no displacement
SS       EBX   EBX    2      8-bit displacement
DS       ECX   ECX    4      32-bit displacement
ES       EDX   EDX    8
FS       ESI   ESI
GS       EDI   EDI
         EBP   EBP
         ESP



Differences between 16- and 32-bit Modes

                 16-bit addressing   32-bit addressing
Base register    BX, BP              EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
Index register   SI, DI              EAX, EBX, ECX, EDX, ESI, EDI, EBP
Scale factor     None                1, 2, 4, 8
Displacement     0, 8, 16 bits       0, 8, 32 bits



16-bit or 32-bit Addressing Mode?

• How does the processor know?
• Uses the D bit in the CS segment descriptor
D = 0
» default size of operands and addresses is 16 bits
D = 1
» default size of operands and addresses is 32 bits
• We can override these defaults
Pentium provides two size override prefixes
66H operand size override prefix
67H address size override prefix
• Using these prefixes, we can mix 16- and 32-bit data and addresses



Examples: Override Prefixes

• Our default mode is 16-bit data and addresses

Example 1: Data size override
mov AX,123 ==> B8 007B
mov EAX,123 ==> 66 | B8 0000007B

Example 2: Address size override
mov AX,[EBX+ESI*2] ==> 67 | 8B0473

Example 3: Address and data size override
mov EAX,[EBX+ESI*2] ==> 66 | 67 | 8B0473
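The memory operand in Examples 2 and 3 is encoded by the two bytes that follow the 8B opcode: 04 (ModR/M) and 73 (SIB). A small Python sketch that decodes just this form is given below; it is illustrative only, handles only the mod = 00, r/m = 100 case without the special base and index encodings, and the function name is an assumption.

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def decode_sib_operand(modrm, sib):
    """Decode a [base + index*scale] operand; returns (base, index, scale)."""
    mod, rm = modrm >> 6, modrm & 7
    if mod != 0 or rm != 4:
        raise ValueError("sketch only handles mod=00 with an SIB byte (r/m=100)")
    scale = 1 << (sib >> 6)            # top two bits: 1, 2, 4, or 8
    index = (sib >> 3) & 7             # middle three bits: index register
    base = sib & 7                     # low three bits: base register
    if index == 4 or base == 5:
        raise ValueError("special encodings are not handled in this sketch")
    return REG32[base], REG32[index], scale


print(decode_sib_operand(0x04, 0x73))
```

The bytes 04 73 decode to base EBX, index ESI, and scale factor 2, that is, the operand [EBX+ESI*2] with no displacement.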




Memory Addressing Modes

• Direct addressing mode
Offset is specified as part of the instruction
– Assembler replaces variable names by their offset values
– Useful to access only simple variables
Example
total_marks = assign_marks + test_marks + exam_marks
translated into
mov EAX,assign_marks
add EAX,test_marks
add EAX,exam_marks
mov total_marks,EAX



Memory Addressing Modes (cont’d)

• Register indirect addressing mode
Effective address is placed in a general-purpose register
In 16-bit segments
» only BX, SI, and DI are allowed to hold an effective address
add AX,[BX] is valid
add AX,[CX] is NOT allowed
In 32-bit segments
» any of the eight 32-bit registers can hold an effective address
add AX,[ECX] is valid



Memory Addressing Modes (cont’d)

• Default Segments
16-bit addresses
» BX, SI, DI : data segment
» BP, SP : stack segment
32-bit addresses
» EAX, EBX, ECX, EDX, ESI, EDI: data segment
» EBP, ESP: stack segment
• Possible to override these defaults
Pentium provides segment override prefixes



Based Addressing

• Effective address is computed as
base + signed displacement
Displacement:
– 16-bit addresses: 8- or 16-bit number
– 32-bit addresses: 8- or 32-bit number
• Useful to access fields of a structure or record
» Base register ==> points to the base address of the structure
» Displacement ==> relative offset within the structure
• Useful to access arrays whose element size is not 2, 4, or 8 bytes
» Displacement ==> points to the beginning of the array
» Base register ==> relative offset of an element within the array



Based Addressing (cont’d)




Indexed Addressing

• Effective address is computed as
(index * scale factor) + signed displacement
16-bit addresses:
– displacement: 8- or 16-bit number
– scale factor: none (i.e., 1)
32-bit addresses:
– displacement: 8- or 32-bit number
– scale factor: 2, 4, or 8
• Useful to access elements of an array (particularly if the element size is 2, 4, or 8 bytes)
» Displacement ==> points to the beginning of the array
» Index register ==> selects an element of the array (array index)
» Scaling factor ==> size of the array element



Indexed Addressing (cont’d)

Examples
add AX,[DI+20]
– We have seen similar usage to access parameters off the stack (in Chapter 10)
add AX,marks_table[ESI*4]
– Assembler replaces marks_table by a constant (i.e., supplies the displacement)
– Each element of marks_table takes 4 bytes (the scale factor value)
– ESI needs to hold the element subscript value
add AX,table1[SI]
– SI needs to hold the element offset in bytes
– When we use the scale factor we avoid such byte counting



Based-Indexed Addressing

Based-indexed addressing with no scale factor
• Effective address is computed as
base + index + signed displacement
• Useful in accessing two-dimensional arrays
» Displacement points to the beginning of the array
» Base and index registers point to a row and an element within that row
• Useful in accessing arrays of records
» Displacement represents the offset of a field in a record
» Base and index registers hold a pointer to the base of the array and the offset of an element relative to the base of the array



Based-Indexed Addressing (cont’d)

• Useful in accessing arrays passed on to a procedure
» Base register points to the beginning of the array
» Index register represents the offset of an element relative to the base of the array
Example
Assuming BX points to table1
mov AX,[BX+SI]
cmp AX,[BX+SI+2]
compares two successive elements of table1



Based-Indexed Addressing (cont’d)

Based-indexed addressing with scale factor
• Effective address is computed as
base + (index * scale factor) + signed displacement

• Useful in accessing two-dimensional arrays when the element size is 2, 4, or 8 bytes

» Displacement ==> points to the beginning of the array

» Base register ==> holds offset to a row (relative to start of array)

» Index register ==> selects an element of the row

» Scaling factor ==> size of the array element
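The general effective-address formula can be written as a one-line Python function, which makes it easy to try the indexed and based-indexed cases by hand. This is a sketch: the function name, the array offsets 1000H and 2000H, and the register values are all made up for illustration.

```python
def effective_address(base=0, index=0, scale=1, disp=0, bits=32):
    """Compute base + (index * scale) + displacement, wrapped to the address size."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & ((1 << bits) - 1)


# Indexed: marks_table[ESI*4] with marks_table at a hypothetical offset 1000H
print(hex(effective_address(index=3, scale=4, disp=0x1000)))

# Based-indexed with scale: 2-byte elements, 3 columns per row,
# base register holds the byte offset of row 3, index register holds the column
row_offset = 3 * 3 * 2
print(hex(effective_address(base=row_offset, index=1, scale=2, disp=0x2000)))
```

In the second call the base register carries the row offset in bytes while the scale factor converts the column subscript to bytes, so no byte counting is needed for the column.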





• Insertion sort
ins_sort.asm
Sorts an integer array using insertion sort algorithm
» Inserts a new number into the sorted array in its right place
• Binary search
bin_srch.asm
Uses binary search to locate a data item in a sorted array
» Efficient search algorithm



Arrays

One-Dimensional Arrays

• Array declaration in HLL (such as C)

int test_marks[10];

specifies a lot of information about the array:

» Name of the array (test_marks)

» Number of elements (10)

» Element size (2 bytes)

» Interpretation of each element (int i.e., signed integer)

» Index range (0 to 9 in C)

• You get very little help in assembly language!




Arrays (cont’d)

• In assembly language, declaration such as
test_marks DW 10 DUP (?)
only assigns name and allocates storage space.
You, as the assembly language programmer, have to “properly” access the array elements by taking into account the element size and the range of subscripts.
• Accessing an array element requires its displacement or offset relative to the start of the array in bytes



Arrays (cont’d)

• To compute displacement, we need to know how the array is laid out

» Simple for 1-D arrays

• Assuming C style subscripts
displacement = subscript * element size in bytes
• If the element size is 2, 4, or 8 bytes
a scale factor can be used to avoid counting displacement in bytes



Multidimensional Arrays

• We focus on two-dimensional arrays» Our discussion can be generalized to higher dimensions

• A 5×3 array can be declared in C as
int class_marks[5][3];
• Two dimensional arrays can be stored in one of two ways:
Row-major order
– Array is stored row by row
– Most HLL including C and Pascal use this method
Column-major order
– Array is stored column by column
– FORTRAN uses this method



Multidimensional Arrays (cont’d)





• Why do we need to know the underlying storage representation?
» In a HLL, we really don’t need to know
» In assembly language, we need this information as we have to calculate displacement of element to be accessed
• In assembly language,
class_marks DW 5*3 DUP (?)
allocates 30 bytes of storage
• There is no support for using row and column subscripts
» Need to translate these subscripts into a displacement value




• Assuming C language subscript convention, we can express displacement of an element in a 2-D array at row i and column j as
displacement = (i * COLUMNS + j) * ELEMENT_SIZE
where
COLUMNS = number of columns in the array
ELEMENT_SIZE = element size in bytes
Example: Displacement of class_marks[3,1] element is (3*3 + 1) * 2 = 20
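The displacement formula translates directly into code. The Python sketch below implements the row-major formula given above; the column-major variant is not stated on the slides and is included only as the standard counterpart, so treat it and both function names as assumptions.

```python
def row_major_displacement(i, j, columns, element_size):
    """Byte displacement of element [i][j] when the array is stored row by row."""
    return (i * columns + j) * element_size


def column_major_displacement(i, j, rows, element_size):
    """Counterpart for column-by-column storage (not from the slides)."""
    return (j * rows + i) * element_size


# class_marks is 5 rows by 3 columns of 2-byte elements
print(row_major_displacement(3, 1, columns=3, element_size=2))
print(column_major_displacement(3, 1, rows=5, element_size=2))
```

The same subscripts give different displacements under the two layouts, which is why the assembly programmer must know which one is in use.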




Examples of Arrays

Example 1
• One-dimensional array
» Computes array sum (each element is 4 bytes long e.g., long integers)
» Uses scale factor 4 to access elements of the array by using a 32-bit addressing mode (uses ESI rather than SI)
» Also illustrates the use of predefined location counter $

Example 2
• Two-dimensional array
» Finds sum of a column
» Uses “based-indexed addressing with scale factor” to access elements of a column



Recursion

• A recursive procedure calls itself
Directly, or
Indirectly
• Some applications can be naturally expressed using recursion
factorial(0) = 1
factorial(n) = n * factorial(n-1) for n > 0
• From implementation viewpoint
Very similar to any other procedure call
» Activation records are stored on the stack



Recursion (cont’d)




Recursion (cont’d)

• Example 1
Factorial
» Discussed before
• Example 2
Quicksort (on an N-element array)
Basic algorithm
» Selects a partition element x
» Assume that the final position of x is array[i]
» Moves elements less than x into array[0]…array[i-1]
» Moves elements greater than x into array[i+1]…array[N-1]
» Applies quicksort recursively to sort these two subarrays
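The basic algorithm can be sketched in Python as follows. The slides do not say how the partition element is chosen or how elements are moved; this version picks the last element of the subarray and uses a simple single-pass partition, both of which are assumptions made for illustration.

```python
def quicksort(array, lo=0, hi=None):
    """Sort array[lo..hi] in place using recursive quicksort."""
    if hi is None:
        hi = len(array) - 1
    if lo >= hi:
        return                              # zero or one element: already sorted
    x = array[hi]                           # partition element (choice is an assumption)
    i = lo
    for k in range(lo, hi):
        if array[k] < x:                    # smaller elements go to the left part
            array[i], array[k] = array[k], array[i]
            i += 1
    array[i], array[hi] = array[hi], array[i]   # x is now at its final position array[i]
    quicksort(array, lo, i - 1)             # sort array[lo]..array[i-1]
    quicksort(array, i + 1, hi)             # sort array[i+1]..array[hi]


data = [5, 2, 9, 1, 5, 6]
quicksort(data)
print(data)
```

Each recursive call creates a new activation record, which is exactly the stack-frame mechanism described earlier.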

Selected Pentium Instructions




Status Flags




Status Flags (cont’d)

• Status flags are updated to indicate certain properties of the result
Example: If the result is zero, zero flag is set
• Once a flag is set, it remains in that state until another instruction that affects the flags is executed
• Not all instructions affect all status flags
add and sub affect all six flags
inc and dec affect all but the carry flag
mov, push, and pop do not affect any flags




• Example; initially, assume ZF = 0

mov AL,55H ; ZF is still zero

sub AL,55H ; result is 0 ; ZF is set (ZF = 1)

push BX ; ZF remains 1

mov BX,AX ; ZF remains 1

pop DX ; ZF remains 1

mov CX,0 ; ZF remains 1

inc CX ; result is 1 ; ZF is cleared (ZF = 0)





• Zero Flag
Indicates zero result
– If the result is zero, ZF = 1
– Otherwise, ZF = 0
Zero can result in several ways (e.g. overflow)
mov AL,0FH      mov AX,0FFFFH   mov AX,1
add AL,0F1H     inc AX          dec AX
» All three examples result in zero result and set ZF
Related instructions
jz jump if zero (jump if ZF = 1)
jnz jump if not zero (jump if ZF = 0)




• Uses of zero flag Two main uses of zero flag

» Testing equality

– Often used with cmp instruction

cmp char,’$’ ; ZF = 1 if char is $

cmp AX,BX

» Counting to a preset value

– Initialize a register with the count value

– Decrement it using dec instruction

– Use jz/jnz to transfer control





• Consider the following code

sum := 0
for (i = 1 to M)
    for (j = 1 to N)
        sum := sum + 1
    end for
end for

• Assembly code

        sub  AX,AX        ; AX := 0
        mov  DX,M
outer_loop:
        mov  CX,N
inner_loop:
        inc  AX
        loop inner_loop
        dec  DX
        jnz  outer_loop
exit_loops:
        mov  sum,AX




• Two observations
loop instruction is equivalent to
dec CX
jnz inner_loop
» This two-instruction sequence is more efficient than the loop instruction (takes less time to execute)
» Unlike dec, the loop instruction does not affect any flags!
The two-instruction sequence
dec DX
jnz outer_loop
is better than initializing DX = 1 and executing
inc DX
cmp DX,M
jle outer_loop




• Carry Flag
Records the fact that the result of an arithmetic operation on unsigned numbers is out of range
The carry flag is set in the following examples
mov AL,0FH      mov AX,12AEH
add AL,0F1H     sub AX,12AFH
Range of 8-, 16-, and 32-bit unsigned numbers
size      range
8 bits    0 to 255 (2^8 - 1)
16 bits   0 to 65,535 (2^16 - 1)
32 bits   0 to 4,294,967,295 (2^32 - 1)




Carry flag is not set by inc and dec instructions
» The carry flag is not set in the following examples
mov AL,0FFH     mov AX,0
inc AL          dec AX
Related instructions
jc jump if carry (jump if CF = 1)
jnc jump if no carry (jump if CF = 0)
Carry flag can be manipulated directly using
stc set carry flag (set CF to 1)
clc clear carry flag (clears CF to 0)
cmc complement carry flag (inverts CF value)




• Uses of carry flag
To propagate carry/borrow in multiword addition/subtraction
                1            <== carry from lower 32 bits
x = 3710 26A8 1257 9AE7H
y = 489B A321 FE60 4213H
    7FAB C9CA 10B7 DCFAH
To detect overflow/underflow condition
» In the last example, carry out of leftmost bit indicates overflow
To test a bit using the shift/rotate instructions
» Bit shifted/rotated out is captured in the carry flag
» We can use jc/jnc to test whether this bit is 1 or 0
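The multiword addition above can be reproduced with a short Python sketch that adds the lower 32 bits first and feeds the resulting carry into the addition of the upper 32 bits. The function names add32 and add64 are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b, carry_in=0):
    """Add two 32-bit values plus a carry-in; returns (32-bit sum, carry-out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def add64(x, y):
    """64-bit addition built from two 32-bit additions, propagating the carry."""
    lo, carry = add32(x & MASK32, y & MASK32)        # lower halves first
    hi, carry = add32(x >> 32, y >> 32, carry)       # upper halves plus the carry
    return (hi << 32) | lo, carry


x = 0x371026A812579AE7
y = 0x489BA321FE604213
result, carry_out = add64(x, y)
print(hex(result), carry_out)
```

The lower-half addition produces a carry of 1, which is what turns the upper-half sum into 7FABC9CAH; the final carry-out of 0 indicates that the 64-bit result did not overflow.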





• Overflow flag
Indicates out-of-range result on signed numbers
– Signed number counterpart of the carry flag
The following code sets the overflow flag but not the carry flag
mov AL,72H ; 72H = 114D
add AL,0EH ; 0EH = 14D
Range of 8-, 16-, and 32-bit signed numbers
size      range
8 bits    -128 to +127                         -2^7 to (2^7 - 1)
16 bits   -32,768 to +32,767                   -2^15 to (2^15 - 1)
32 bits   -2,147,483,648 to +2,147,483,647     -2^31 to (2^31 - 1)




Unsigned interpretation

        mov AL,72H
        add AL,0EH
        jc  overflow
no_overflow:
        (no overflow code here)
        . . . .
overflow:
        (overflow code here)
        . . . .

Signed interpretation

        mov AL,72H
        add AL,0EH
        jo  overflow
no_overflow:
        (no overflow code here)
        . . . .
overflow:
        (overflow code here)
        . . . .

• Signed or unsigned: How does the system know?
The processor does not know the interpretation
It sets carry and overflow under each interpretation
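Because the processor sets both flags on every addition, a single 8-bit add can be modelled once and then read under either interpretation. The Python sketch below computes CF, OF, ZF, and SF for an 8-bit add; the function name add8 and the dictionary layout are assumptions made for illustration.

```python
def add8(a, b):
    """Model an 8-bit `add`; returns (result, flags) with CF, OF, ZF, and SF."""
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                               # unsigned out of range
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),  # signed out of range
        "ZF": int(result == 0),
        "SF": result >> 7,                                     # copy of the MSB
    }
    return result, flags


print(add8(0x72, 0x0E))    # 114 + 14: fits unsigned, overflows signed
print(add8(0x0F, 0xF1))    # 15 + 241: unsigned carry, result is zero
```

For 72H + 0EH the result 80H is a valid unsigned value (128), so CF stays clear, but as a signed value it is out of range, so OF is set; the program picks jc or jo depending on which interpretation it intends.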





Related instructions
    jo    jump if overflow (jump if OF = 1)
    jno   jump if no overflow (jump if OF = 0)

There is a special software interrupt instruction
    into  interrupt on overflow
Details on this instruction in Chapter 20

• Uses of overflow flag
Main use
» To detect out-of-range result on signed numbers





• Sign flag

Indicates the sign of the result
– Useful only when dealing with signed numbers
– Simply a copy of the most significant bit of the result

Examples

    mov AL,15                      mov AL,15
    add AL,97                      sub AL,97
    clears the sign flag as        sets the sign flag as
    the result is 112              the result is -82
    (or 01110000 in binary)        (or 10101110 in binary)

Related instructions
    js    jump if sign (jump if SF = 1)
    jns   jump if no sign (jump if SF = 0)





• Consider the count down loop:

    for (i = M downto 0)
        <loop body>
    end for

The count down loop can be implemented as

        mov CX,M
    for_loop:
        <loop body>
        dec CX
        jns for_loop

• If we don’t use the jns, we need cmp as shown below:

        cmp CX,0
        jge for_loop

• Usage of sign flag
To test the sign of the result
Also useful to efficiently implement countdown loops





• Auxiliary flag

Indicates whether an operation produced a carry or borrow in the low-order 4 bits (nibble) of 8-, 16-, or 32-bit operands (i.e., operand size doesn’t matter)

Example
                                       1   <- carry from lower 4 bits
    mov AL,43        43D = 0010 1011B
    add AL,94        94D = 0101 1110B
                    137D = 1000 1001B

» As there is a carry from the lower nibble, auxiliary flag is set





Related instructions
» No conditional jump instructions with this flag
» Arithmetic operations on BCD numbers use this flag
    aaa   ASCII adjust for addition
    aas   ASCII adjust for subtraction
    aam   ASCII adjust for multiplication
    aad   ASCII adjust for division
    daa   Decimal adjust for addition
    das   Decimal adjust for subtraction
– Appendix I has more details on these instructions

Usage
» Main use is in performing arithmetic operations on BCD numbers





• Parity flag

Indicates even parity of the low 8 bits of the result
– PF is set if the lower 8 bits contain an even number of 1 bits
– For 16- and 32-bit values, only the least significant 8 bits are considered for computing parity value

Example

    mov AL,53        53D = 0011 0101B
    add AL,89        89D = 0101 1001B
                    142D = 1000 1110B

» As the result has an even number of 1 bits, parity flag is set

Related instructions
    jp    jump on even parity (jump if PF = 1)
    jnp   jump on odd parity (jump if PF = 0)





Usage of parity flag
» Useful in writing data encoding programs
» Example: Encodes the byte in AL (MSB is the parity bit)

    parity_encode PROC
            shl  AL,1           ; drop old MSB; PF reflects the 7 data bits
            jp   parity_zero
            stc                 ; CF = 1
            jmp  move_parity_bit
    parity_zero:
            clc                 ; CF = 0
    move_parity_bit:
            rcr  AL,1           ; rotate CF into the MSB
            ret
    parity_encode ENDP
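The effect of the encoding procedure can be checked with a small Python model. This is a sketch of the same even-parity scheme, not a translation of the instruction stream; the Python function name simply mirrors the procedure name.

```python
def parity_encode(byte):
    """Return the byte with its MSB replaced by an even-parity bit
    computed over the low 7 data bits."""
    data = byte & 0x7F                  # the shift discards the old MSB
    ones = bin(data).count("1")
    parity_bit = ones & 1               # 1 only when the count of 1s is odd
    return (parity_bit << 7) | data     # the rotate puts the bit in the MSB

for value in (0x41, 0x43):
    encoded = parity_encode(value)
    print(hex(value), "->", hex(encoded))
```

Every encoded byte ends up with an even number of 1 bits, which is what a receiver would verify.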




Arithmetic Instructions

• Pentium provides several arithmetic instructions that operate on 8-, 16- and 32-bit operands

» Addition: add, adc, inc

» Subtraction: sub, sbb, dec, neg, cmp

» Multiplication: mul, imul

» Division: div, idiv

» Related instructions: cbw, cwd, cdq, cwde, movsx, movzx

There are a few other instructions such as aaa, aas, etc. that operate on decimal numbers

» See Appendix I for details





• Multiplication

More complicated than add/sub
» Produces double-length results
    – E.g. Multiplying two 8-bit numbers produces a 16-bit result
» Cannot use a single multiply instruction for signed and unsigned numbers
    – add and sub instructions work both on signed and unsigned numbers
    – For multiplication, we need separate instructions
        mul    for unsigned numbers
        imul   for signed numbers





• Unsigned multiplication

    mul source

» Depending on the source operand size, the location of the other source operand and destination are selected





Example

    mov  AL,10
    mov  DL,25
    mul  DL

produces 250D in AX register (result fits in AL)

• The imul instruction can use the same syntax
» Also supports other formats

Example

    mov  DL,0FFH    ; DL = -1
    mov  AL,0BEH    ; AL = -66
    imul DL

produces 66D in AX register (again, result fits in AL)
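Why two instructions are needed shows up when the same bit patterns are multiplied under both interpretations. The Python sketch below models only the 8-bit form (AL times the source, product in AX); the helper names `mul8`, `imul8`, and `to_signed` are hypothetical.

```python
def to_signed(value, bits):
    """Reinterpret an unsigned bit pattern as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def mul8(al, src):
    """Unsigned multiply: returns the 16-bit product that would land in AX."""
    return (al & 0xFF) * (src & 0xFF)

def imul8(al, src):
    """Signed multiply: returns the 16-bit pattern that would land in AX."""
    return (to_signed(al, 8) * to_signed(src, 8)) & 0xFFFF

print(mul8(10, 25))                 # the unsigned example from the slide
print(imul8(0xBE, 0xFF))            # (-66) * (-1)
print(hex(mul8(0xBE, 0xFF)))        # same bits read as 190 * 255
```

The last two calls use identical operands yet produce different products, which a single multiply instruction could not do.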





• Division instruction

Even more complicated than multiplication
» Produces two results
    – Quotient
    – Remainder
» In multiplication, using a double-length register, there will not be any overflow
    – In division, divide overflow is possible
      Pentium provides a special software interrupt when a divide overflow occurs

Two instructions as in multiplication
    div  source    for unsigned numbers
    idiv source    for signed numbers





• Dividend is twice the size of the divisor

• Dividend is assumed to be in
    AX (8-bit divisor)
    DX:AX (16-bit divisor)
    EDX:EAX (32-bit divisor)









• Example

    mov  AX,251
    mov  CL,12
    div  CL

produces 20D in AL and 11D as remainder in AH

• Example

    sub  DX,DX       ; clear DX
    mov  AX,141BH    ; AX = 5147D
    mov  CX,012CH    ; CX = 300D
    div  CX

produces 17D in AX and 47D as remainder in DX
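The divide overflow condition mentioned earlier is easy to see in a model of the 8-bit form. The Python sketch below is illustrative only: the name `div8` is hypothetical, and Python exceptions stand in for the divide-error interrupt.

```python
def div8(ax, divisor):
    """Unsigned divide of a 16-bit dividend by an 8-bit divisor.
    Returns (quotient for AL, remainder for AH)."""
    ax &= 0xFFFF
    divisor &= 0xFF
    if divisor == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    quotient, remainder = divmod(ax, divisor)
    if quotient > 0xFF:
        # The quotient must fit in the 8-bit AL register.
        raise OverflowError("divide overflow: quotient does not fit in AL")
    return quotient, remainder

print(div8(251, 12))
```

Calling `div8(0x1000, 2)` raises `OverflowError`, because the quotient 2048 cannot be held in 8 bits.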





• Signed division requires some help
» We extended an unsigned 16-bit number to 32 bits by placing zeros in the upper 16 bits
» This will not work for signed numbers
    – To extend signed numbers, you have to copy the sign bit into those upper bit positions

Pentium provides three instructions to aid sign extension
» All three take no operands
    cbw   converts byte to word (extends AL into AH)
    cwd   converts word to doubleword (extends AX into DX)
    cdq   converts doubleword to quadword (extends EAX into EDX)





Some additional related instructions
» Sign extension
    cwde   converts word to doubleword (extends AX into EAX)
» Two move instructions
    movsx dest,src   (move sign-extended src to dest)
    movzx dest,src   (move zero-extended src to dest)
» For both move instructions, dest has to be a register
» The src operand can be in a register or memory
    – If src is 8 bits, dest must be either a 16- or 32-bit register
    – If src is 16 bits, dest must be a 32-bit register





• Example

    mov  AL,-95
    cbw              ; AH = FFH
    mov  CL,12
    idiv CL

produces -7D in AL and -11D as remainder in AH

• Example

    mov  AX,-5147
    cwd              ; DX := FFFFH
    mov  CX,300
    idiv CX

produces -17D in AX and -47D as remainder in DX
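These results follow from two rules: the sign bit is copied into the upper half, and signed division truncates toward zero so the remainder takes the sign of the dividend. The Python sketch below models both; the function names mirror the mnemonics for readability only. Python's own `//` operator floors instead of truncating, so the quotient is computed from absolute values.

```python
def cbw(al):
    """Sign-extend an 8-bit pattern to 16 bits (AL into AX)."""
    al &= 0xFF
    return al | 0xFF00 if al & 0x80 else al

def idiv(dividend, divisor):
    """Signed division that truncates toward zero.
    Returns (quotient, remainder) as Python integers."""
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    return quotient, remainder

print(hex(cbw(-95 & 0xFF)))    # AL = A1H extends to FFA1H
print(idiv(-95, 12))
print(idiv(-5147, 300))
```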





• Use of Shifts for Multiplication and Division

Shifts are more efficient

Example: Multiply AX by 32

    mov  CX,32
    imul CX

takes 12 clock cycles

Using

    sal  AX,5

takes just one clock cycle




Application Examples

• PutInt8 procedure

To display a number, repeatedly divide it by 10 and display the remainders obtained

                 quotient    remainder
    108/10          10           8
    10/10            1           0
    1/10             0           1

To display digits, they must be converted to their character form
» This means simply adding the ASCII code for zero

    line 24:    add AH,'0'
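The repeated-division idea can be sketched in Python as follows. This is not the PutInt8 procedure itself; `put_int` is a hypothetical name, it handles non-negative values only, and it makes explicit that the remainders arrive least significant digit first and must be reversed before display.

```python
def put_int(number):
    """Convert a non-negative integer to its decimal string by
    repeated division by 10."""
    digits = []
    while True:
        number, remainder = divmod(number, 10)
        digits.append(chr(remainder + ord("0")))   # digit to character
        if number == 0:
            break
    return "".join(reversed(digits))               # remainders come out backwards

print(put_int(108))
```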




Application Examples (cont’d)

• GetInt8 procedure

To read a number, read each digit character
» Convert to its numeric equivalent
» Multiply the running total by 10 and add this digit

    Input digit      Numeric value (N)    Number := Number*10 + N
    Initial value          --             0
    '1'                     1             0 * 10 + 1 = 1
    '5'                     5             1 * 10 + 5 = 15
    '8'                     8             15 * 10 + 8 = 158
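The table above corresponds to the following Python sketch. The name `get_int` is hypothetical, and the sketch assumes the input holds only digit characters; sign handling and range checks, which a real input routine would need, are left out.

```python
def get_int(text):
    """Convert a string of decimal digit characters to an integer."""
    number = 0
    for ch in text:
        n = ord(ch) - ord("0")        # character to numeric value
        number = number * 10 + n      # shift the running total one decimal place
    return number

print(get_int("158"))
```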




Indirect Jumps

• Direct jump
Target address is encoded in the instruction itself

• Indirect jump
Introduces a level of indirection
» Address is specified either through memory or a general-purpose register

Example

    jmp  CX

jumps to the address in CX

Address is absolute
» Not relative as in direct jumps




Indirect Jumps (cont’d)

switch (ch) {
    case '0':
        count[0]++; break;
    case '1':
        count[1]++; break;
    case '2':
        count[2]++; break;
    case '3':
        count[3]++; break;
    default:
        count[4]++;
}





Turbo C assembly code for the switch statement

    _main   PROC NEAR
            . . .
            mov   AL,ch
            cbw
            sub   AX,48        ; 48 = ASCII for 0
            mov   BX,AX
            cmp   BX,3
            ja    default
            shl   BX,1         ; BX = BX * 2
            jmp   WORD PTR CS:jump_table[BX]    ; indirect jump





case_0: inc WORD PTR [BP-10]

jmp SHORT end_switch







default: inc WORD PTR [BP-2]

end_switch:

. . .

_main ENDP





jump_table LABEL WORD

DW case_0

DW case_1

DW case_2

DW case_3

. . .

• Indirect jump uses this table to jump to the appropriate case routine

• The indirect jump instruction uses segment override prefix to refer to the jump_table in the CODE segment

Jump table for the indirect jump
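The dispatch pattern can be imitated in Python with a list of callables standing in for the table of code addresses. This is only an analogy: the function name `count_chars` is hypothetical, and the sketch assumes the default case updates a separate fifth counter, as the `[BP-2]` operand in the assembly listing suggests.

```python
def count_chars(text):
    """Count occurrences of '0'..'3', with everything else in a default slot."""
    count = [0] * 5

    def make_case(i):
        def case():
            count[i] += 1
        return case

    jump_table = [make_case(i) for i in range(4)]   # case_0 .. case_3
    default = make_case(4)

    for ch in text:
        index = ord(ch) - 48            # 48 = ASCII for 0
        if 0 <= index <= 3:             # the assembly needs only one unsigned compare
            jump_table[index]()         # indirect jump through the table
        else:
            default()
    return count

print(count_chars("0123x0"))
```

The assembly version scales the index by 2 because each table entry is a 2-byte offset; a Python list hides that scaling.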




Conditional Jumps

• Three types of conditional jumps

Jumps based on the value of a single flag
» Arithmetic flags such as zero, carry can be tested using these instructions

Jumps based on unsigned comparisons
» Operands of cmp instruction are treated as unsigned numbers

Jumps based on signed comparisons
» Operands of cmp instruction are treated as signed numbers




Jumps Based on Single Flags

Testing for zero
    jz     jump if zero         jumps if ZF = 1
    je     jump if equal        jumps if ZF = 1
    jnz    jump if not zero     jumps if ZF = 0
    jne    jump if not equal    jumps if ZF = 0
    jcxz   jump if CX = 0       jumps if CX = 0 (flags are not tested)




Jumps Based on Single Flags (cont’d)

Testing for carry
    jc     jump if carry           jumps if CF = 1
    jnc    jump if no carry        jumps if CF = 0

Testing for overflow
    jo     jump if overflow        jumps if OF = 1
    jno    jump if no overflow     jumps if OF = 0

Testing for sign
    js     jump if negative        jumps if SF = 1
    jns    jump if not negative    jumps if SF = 0




Jumps Based on Single Flags (cont’d)

Testing for parity
    jp     jump if parity            jumps if PF = 1
    jpe    jump if parity is even    jumps if PF = 1
    jnp    jump if not parity        jumps if PF = 0
    jpo    jump if parity is odd     jumps if PF = 0




Jumps Based on Unsigned Comparisons

    Mnemonic   Meaning                       Condition
    je         jump if equal                 ZF = 1
    jz         jump if zero                  ZF = 1
    jne        jump if not equal             ZF = 0
    jnz        jump if not zero              ZF = 0
    ja         jump if above                 CF = ZF = 0
    jnbe       jump if not below or equal    CF = ZF = 0




Jumps Based on Unsigned Comparisons

    Mnemonic   Meaning                       Condition
    jae        jump if above or equal        CF = 0
    jnb        jump if not below             CF = 0
    jb         jump if below                 CF = 1
    jnae       jump if not above or equal    CF = 1
    jbe        jump if below or equal        CF=1 or ZF=1
    jna        jump if not above             CF=1 or ZF=1




Jumps Based on Signed Comparisons

    Mnemonic   Meaning                       Condition
    je         jump if equal                 ZF = 1
    jz         jump if zero                  ZF = 1
    jne        jump if not equal             ZF = 0
    jnz        jump if not zero              ZF = 0
    jg         jump if greater               ZF=0 & SF=OF
    jnle       jump if not less or equal     ZF=0 & SF=OF




Jumps Based on Signed Comparisons (cont’d)

    Mnemonic   Meaning                         Condition
    jge        jump if greater or equal        SF = OF
    jnl        jump if not less                SF = OF
    jl         jump if less                    SF ≠ OF
    jnge       jump if not greater or equal    SF ≠ OF
    jle        jump if less or equal           ZF=1 or SF ≠ OF
    jng        jump if not greater             ZF=1 or SF ≠ OF
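The flag conditions in these tables can be checked with a Python model of cmp, which performs a subtraction and keeps only the flags. The sketch below is illustrative; `cmp_flags` and the predicate functions are hypothetical helpers named after the mnemonics they model.

```python
def cmp_flags(a, b, bits=16):
    """Model cmp a,b: compute a - b and return the resulting flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    diff = (a - b) & mask
    return {
        "ZF": int(diff == 0),
        "CF": int(a < b),                                  # unsigned borrow
        "SF": int(bool(diff & sign)),
        "OF": int(bool((a ^ b) & (a ^ diff) & sign)),      # signed out of range
    }

def ja(f): return f["CF"] == 0 and f["ZF"] == 0
def jb(f): return f["CF"] == 1
def jg(f): return f["ZF"] == 0 and f["SF"] == f["OF"]
def jl(f): return f["SF"] != f["OF"]

f = cmp_flags(0xFFFF, 0x0001)    # 65535 vs 1 unsigned, -1 vs 1 signed
print(f, "ja:", ja(f), "jl:", jl(f))
```

For this operand pair both `ja` and `jl` are taken: the same flags say "above" in the unsigned reading and "less" in the signed reading.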




Implementing HLL Decision Structures

• High-level language decision structures can be implemented in a straightforward way

• See Section 12.4 for examples that implement
    if-then-else
    if-then-else with a relational operator
    if-then-else with logical operator AND
    if-then-else with logical operator OR
    while loop
    repeat-until loop
    for loops




Logical Expressions in HLLs

• Representation of Boolean data
Only a single bit is needed to represent Boolean data
Usually a single byte is used
» For example, in C
    – All zero bits represent false
    – A non-zero value represents true

• Logical expressions
Logical instructions AND, OR, etc. are used

• Bit manipulation
Logical, shift, and rotate instructions are used




Evaluation of Logical Expressions

• Two basic ways

Full evaluation
» Entire expression is evaluated before assigning a value
» PASCAL uses full evaluation

Partial evaluation
» Assigns as soon as the final outcome is known without blindly evaluating the entire logical expression
» Two rules help:
    – cond1 AND cond2
      If cond1 is false, no need to evaluate cond2
    – cond1 OR cond2
      If cond1 is true, no need to evaluate cond2




Evaluation of Logical Expressions (cont’d)

• Partial evaluation
Used by C

• Useful in certain cases to avoid run-time errors

• Example

    if ((X > 0) AND (Y/X > 100))

If X is 0, full evaluation results in a divide error
Partial evaluation will not evaluate (Y/X > 100) if X = 0

• Partial evaluation is used to test if a pointer value is NULL before accessing the data it points to
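The difference between the two strategies can be demonstrated in Python, whose `and` operator also uses partial evaluation. In this sketch the function names are hypothetical, and full evaluation is imitated by computing both conditions before combining them.

```python
def full_eval(x, y):
    """Evaluate both conditions first, then combine them."""
    left = x > 0
    right = y / x > 100          # evaluated even when x == 0
    return left and right

def partial_eval(x, y):
    """Stop as soon as the outcome is known."""
    return x > 0 and y / x > 100  # y / x is skipped when x > 0 is false

print(partial_eval(0, 5))
try:
    full_eval(0, 5)
except ZeroDivisionError as err:
    print("full evaluation failed:", err)
```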




Bit Instructions

• Bit Test and Modify Instructions
Four bit test instructions
Each takes the position of the bit to be tested

    Instruction                        Effect on the selected bit
    bt  (Bit Test)                     No effect
    bts (Bit Test and Set)             selected bit := 1
    btr (Bit Test and Reset)           selected bit := 0
    btc (Bit Test and Complement)      selected bit := NOT(selected bit)




Bit Instructions (cont’d)

• All four instructions have the same format

• We use bt to illustrate the format

    bt  operand,bit_pos

operand is word or doubleword
» Can be in a register or memory

bit_pos indicates the position of the bit to be tested
» Can be an immediate value or in a 16/32-bit register

• Instructions in this group affect only the carry flag
» Other five flags are undefined




Bit Scan Instructions

• These instructions scan the operand for a 1 bit and return the bit position in a register

• Two instructions

    bsf  dest_reg,operand    ; bit scan forward
    bsr  dest_reg,operand    ; bit scan reverse

» operand can be a word or doubleword in a register or memory
» dest_reg receives the bit position
    – Must be a 16- or 32-bit register

Only ZF is updated (other five flags undefined)
    – ZF = 1 if all bits of operand are 0
    – ZF = 0 otherwise (position of first 1 bit in dest_reg)
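A Python model of the two scans is sketched below. Forward scan looks from bit 0 upward and reverse scan from the most significant bit downward. The functions return a (ZF, position) pair; using `None` for the position when the operand is zero is a modeling choice of this sketch, reflecting that no bit position is found in that case.

```python
def bsf(operand):
    """Bit scan forward: position of the least significant 1 bit."""
    if operand == 0:
        return 1, None
    lowest = operand & -operand          # isolates the lowest set bit
    return 0, lowest.bit_length() - 1

def bsr(operand):
    """Bit scan reverse: position of the most significant 1 bit."""
    if operand == 0:
        return 1, None
    return 0, operand.bit_length() - 1

print(bsf(0b101000), bsr(0b101000))
```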





• Example 1 Linear search of an integer array

• Example 2 Selection sort on an integer array

• Example 3 Multiplication using shift and add operations

» Multiplies two unsigned 8-bit numbers

– Uses a loop that iterates 8 times

• Example 4 Multiplication using bit instructions




String Representation

• Two types
Fixed-length
Variable-length

• Fixed-length strings
Each string uses the same length
» Shorter strings are padded (e.g. by blank characters)
» Longer strings are truncated

Selection of string length is critical
» Too large ==> inefficient
» Too small ==> truncation of larger strings




String Representation (cont’d)

• Variable-length strings
Avoids the pitfalls associated with fixed-length strings

• Two ways of representation

Explicitly storing string length (used in PASCAL)

    string    DB   'Error message'
    str_len   DW   $-string

    – $ represents the current value of the location counter
      $ points to the byte after the last character of string

Using a sentinel character (used in C)
» Uses NULL character
    – Such NULL-terminated strings are called ASCIIZ strings




String Instructions

• Five string instructions

    LODS   LOaD String       source
    STOS   STOre String      destination
    MOVS   MOVe String       source & destination
    CMPS   CoMPare String    source & destination
    SCAS   SCAn String       destination

• Specifying operands

32-bit segments:
    DS:ESI = source operand
    ES:EDI = destination operand

16-bit segments:
    DS:SI = source operand
    ES:DI = destination operand




String Instructions (cont’d)

• Each string instruction
Can operate on 8-, 16-, or 32-bit operands
Updates index register(s) automatically
» Byte operands: increment/decrement by 1
» Word operands: increment/decrement by 2
» Doubleword operands: increment/decrement by 4

• Direction flag
DF = 0: Forward direction (increments index registers)
DF = 1: Backward direction (decrements index registers)

• Two instructions to manipulate DF
    std   set direction flag (DF = 1)
    cld   clear direction flag (DF = 0)




Repetition Prefixes

• String instructions can be repeated by using a repetition prefix

• Two types

Unconditional repetition
    rep           REPeat

Conditional repetition
    repe/repz     REPeat while Equal
                  REPeat while Zero
    repne/repnz   REPeat while Not Equal
                  REPeat while Not Zero




Repetition Prefixes (cont’d)

rep
    while (CX ≠ 0)
        execute the string instruction
        CX := CX - 1
    end while

• CX register is first checked
If zero, string instruction is not executed at all
More like the JCXZ instruction





repe/repz
    while (CX ≠ 0)
        execute the string instruction
        CX := CX - 1
        if (ZF = 0)
        then
            exit loop
        end if
    end while

• Useful with cmps and scas string instructions





repne/repnz
    while (CX ≠ 0)
        execute the string instruction
        CX := CX - 1
        if (ZF = 1)
        then
            exit loop
        end if
    end while




String Move Instructions

• Three basic instructions movs, lods, and stos

Move a string (movs)

• Format

    movs   dest_string,source_string
    movsb         ; operands are bytes
    movsw         ; operands are words
    movsd         ; operands are doublewords

• First form is not used frequently
Source and destination are pointed to by DS:(E)SI and ES:(E)DI, respectively




String Move Instructions (cont’d)

movsb --- move a byte string

    ES:DI := (DS:SI)    ; copy a byte
    if (DF=0)           ; forward direction
    then
        SI := SI+1
        DI := DI+1
    else                ; backward direction
        SI := SI-1
        DI := DI-1
    end if

Flags affected: none





Example

    .DATA
    string1   DB   'The original string',0
    strLen    EQU  $ - string1
    string2   DB   80 DUP (?)
    .CODE
        .STARTUP
        mov   AX,DS               ; set up ES
        mov   ES,AX               ; to the data segment
        mov   CX,strLen           ; strLen includes NULL
        mov   SI,OFFSET string1
        mov   DI,OFFSET string2
        cld                       ; forward direction
        rep   movsb





Load a String (LODS)

• Copies the value from the source string at DS:(E)SI to
    AL (lodsb)
    AX (lodsw)
    EAX (lodsd)

• Repetition prefix does not make sense
It leaves only the last value in AL, AX, or EAX register





lodsb --- load a byte string

    AL := (DS:SI)       ; copy a byte
    if (DF=0)           ; forward direction
    then
        SI := SI+1
    else                ; backward direction
        SI := SI-1
    end if

Flags affected: none





Store a String (STOS)

• Performs the complementary operation

• Copies the value in
» AL (stosb)
» AX (stosw)
» EAX (stosd)
to the destination string at ES:(E)DI

• Repetition prefix can be used to initialize a block of memory





stosb --- store a byte string

    ES:DI := AL         ; copy a byte
    if (DF=0)           ; forward direction
    then
        DI := DI+1
    else                ; backward direction
        DI := DI-1
    end if

Flags affected: none





Example: Initializes array1 with -1

    .DATA
    array1    DW   100 DUP (?)
    .CODE
        .STARTUP
        mov   AX,DS               ; set up ES
        mov   ES,AX               ; to the data segment
        mov   CX,100
        mov   DI,OFFSET array1
        mov   AX,-1
        cld                       ; forward direction
        rep   stosw





• In general, repeat prefixes are not useful with lods and stos (block initialization with stos is the main exception)

• Used in a loop to do conversions while copying

        mov   CX,strLen
        mov   SI,OFFSET string1
        mov   DI,OFFSET string2
        cld                       ; forward direction
    loop1:
        lodsb
        or    AL,20H
        stosb
        loop  loop1
    done:
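The load, convert, store pattern of this loop can be modeled in Python as below. The function name `to_lower_copy` is hypothetical. One difference from the assembly is worth noting: the `while` test makes a zero count copy nothing, whereas the loop instruction decrements CX before testing it.

```python
def to_lower_copy(source):
    """Copy bytes while setting bit 5, which maps 'A'..'Z' to 'a'..'z'."""
    dest = bytearray(len(source))
    si = di = 0
    cx = len(source)
    while cx != 0:
        al = source[si]      # lodsb: load byte, advance SI
        si += 1
        al |= 0x20           # or AL,20H
        dest[di] = al        # stosb: store byte, advance DI
        di += 1
        cx -= 1              # loop loop1
    return bytes(dest)

print(to_lower_copy(b"Hello"))
```

Setting bit 5 unconditionally also alters bytes that are not letters (a NULL terminator becomes a space, for instance), so the trick is safe only when the input is known to contain letters.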




String Compare Instruction

cmpsb --- compare two byte strings

    Compare two bytes at DS:SI and ES:DI and set flags
    if (DF=0)           ; forward direction
    then
        SI := SI+1
        DI := DI+1
    else                ; backward direction
        SI := SI-1
        DI := DI-1
    end if

Flags affected: As per cmp instruction (DS:SI) - (ES:DI)




String Compare Instruction (cont’d)

    .DATA
    string1   DB   'abcdfghi',0
    strLen    EQU  $ - string1
    string2   DB   'abcdefgh',0
    .CODE
        .STARTUP
        mov   AX,DS               ; set up ES
        mov   ES,AX               ; to the data segment
        mov   CX,strLen
        mov   SI,OFFSET string1
        mov   DI,OFFSET string2
        cld                       ; forward direction
        repe  cmpsb
        dec   SI
        dec   DI                  ; leaves SI & DI pointing to the first character that differs




String Compare Instruction (cont’d)

    .DATA
    string1   DB   'abcdfghi',0
    strLen    EQU  $ - string1 - 1
    string2   DB   'abcdefgh',0
    .CODE
        .STARTUP
        mov   AX,DS               ; set up ES
        mov   ES,AX               ; to the data segment
        mov   CX,strLen
        mov   SI,OFFSET string1 + strLen - 1
        mov   DI,OFFSET string2 + strLen - 1
        std                       ; backward direction
        repne cmpsb
        inc   SI                  ; leaves SI & DI pointing to the first character that matches
        inc   DI                  ; in the backward direction




String Scan Instruction

scasb --- scan a byte string

    Compare AL to the byte at ES:DI and set flags
    if (DF=0)           ; forward direction
    then
        DI := DI+1
    else                ; backward direction
        DI := DI-1
    end if

Flags affected: As per cmp instruction AL - (ES:DI)

• scasw uses AX and scasd uses EAX registers instead of AL




String Scan Instruction (cont’d)

Example 1

    .DATA
    string1   DB   'abcdefgh',0
    strLen    EQU  $ - string1
    .CODE
        .STARTUP
        mov   AX,DS               ; set up ES
        mov   ES,AX               ; to the data segment
        mov   CX,strLen
        mov   DI,OFFSET string1
        mov   AL,'e'              ; character to be searched
        cld                       ; forward direction
        repne scasb
        dec   DI                  ; leaves DI pointing to e in string1
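Why the final dec DI is needed becomes clear from a Python model of the forward scan. In this sketch `repne_scasb` is a hypothetical helper, a byte string stands in for the segment, and ZF is returned as 0 when the count starts at zero, which is a simplification because the real instruction then leaves ZF unchanged.

```python
def repne_scasb(memory, di, cx, al):
    """Model cld followed by repne scasb; return (DI, CX, ZF)."""
    zf = 0
    while cx != 0:
        zf = int(memory[di] == al)   # compare AL with the byte at ES:DI
        di += 1                      # DI advances after every compare
        cx -= 1
        if zf == 1:                  # repne stops once a match sets ZF
            break
    return di, cx, zf

string1 = b"abcdefgh\x00"
di, cx, zf = repne_scasb(string1, 0, len(string1), ord("e"))
print(di, cx, zf, chr(string1[di - 1]))
```

On a match DI has already moved one byte past the character found, so backing up by one points at it. When ZF is 0 after the scan, the character was not present and DI should not be used.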




String Scan Instruction (cont’d)

.DATA

string1 DB ' abc',0

strLen EQU $ - string1

.CODE

.STARTUP

mov AX,DS ; set up ES

mov ES,AX ; to the data segment

mov CX,strLen

mov DI,OFFSET string1

mov AL,' ' ; character to be searched

cld ; forward direction

repe scasb

dec DI ; leaves DI pointing to the first non-blank character a

Example 2





LDS and LES instructions

• String pointer can be loaded into the DS:SI or ES:DI register pair by using lds or les instructions

• Syntax

    lds   register,source
    les   register,source

register should be a 16-bit register
source is a pointer to a 32-bit memory operand

• register is typically SI in lds and DI in les




Illustrative Examples (cont’d)

• Actions of lds and les

lds
    register := (source)
    DS := (source+2)

les
    register := (source)
    ES := (source+2)
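The memory layout these actions imply (offset word first, segment word at source+2, both little-endian) can be illustrated in Python with the standard `struct` module. The function name `lds` is borrowed from the mnemonic, and the byte values are made up for the illustration.

```python
import struct

def lds(memory, source):
    """Read a 32-bit far pointer stored at 'source'.
    Returns (offset for the register, segment for DS)."""
    offset, segment = struct.unpack_from("<HH", memory, source)
    return offset, segment

# Hypothetical pointer value 5678H:1234H stored in little-endian order.
memory = bytes.fromhex("34127856")
offset, segment = lds(memory, 0)
print(hex(segment), hex(offset))
```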

• Pentium also supports lfs, lgs, and lss to load the other segment registers




Illustrative Examples (cont’d)

• Seven popular string processing routines are given as examples in string.asm str_len

str_mov

str_cpy

str_cat

str_cmp

str_chr

str_cnv

Given in the text




Indirect Procedure Call

• Direct procedure calls specify the offset of the first instruction of the called procedure

• In indirect procedure call, the offset is specified through memory or a register

If BX contains a pointer to the procedure, we can use

    call BX

If the word in memory at target_proc_ptr contains the offset of the called procedure, we can use

    call target_proc_ptr

• These are similar to direct and indirect jumps