
1

ECE 371 Microprocessors

Chapter 5: Microprocessor Assembly Language 2

Herbert G. Mayer, PSU

Status 10/2/2015

For use at CCUT Fall 2015

2

Syllabus

Motivation
Integer Multiply
Integer Divide
Conditional Branch
Loop Constructs
Memory Access
Call and Return
Procedure PutDec
Summary
Appendix
Bibliography

3

Motivation

In another handout about x86 assembly language we cover modules, character- and string-output, and writing assembler procedures

Here we cover integer arithmetic, loops, and develop a more complex program to output signed integer numbers

Since integer multiplication can generate results that are twice as long in bits as either source operand, the machine instructions for integer multiply (and, conversely, for integer divide) must make special provisions for the length of operands

4

X86 Integer Multiply and Divide

5

Integer Multiply

Our first project is 16-bit signed integer multiplication

To track all minute detail of the result, including overflow, sign of the result, etc. we use the small x86 machine model, which uses 16-bit operands

In that model, the smallest negative integer is -32768, the largest is 32767

The same principles apply to the newer model with 64-bit precision

6

Integer Multiply

Under Microsoft's assembler the opcodes are mul for unsigned, and imul for signed integer multiplication

One operand is the ax register; it is always implied

The other operand may be a memory location, or another register

A literal operand is not permitted in the small mode, i.e. on the 16-bit version of the architecture; it is permitted on the 32-bit version

The result/product is in the register pair ax and dx

There exists also a byte-version of the multiply, in which case the implied operand is in al, the other operand is a byte memory location or a byte register, and the result/product is in ax

In the code sample below we multiply literal 10, moved into register bx, with the contents of the second, implied operand: register ax

7

Integer Multiply

; integer multiplication on x86, small mode:
; multiply literal 10 with contents of ax
; ax holds a copy of memory location MAX

    mov  bx, 10     ; a literal is in bx
    mov  ax, MAX    ; signed word at location MAX
    imul bx         ; product is in ax + dx
                    ; hi order 16 bits in dx
    . . .
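The double-length product can be modelled in C. The following is a minimal sketch, not the author's code: the names `dx_ax_t` and `imul16` are made up for illustration, and the sketch only mirrors how a 16-bit by 16-bit signed product is split across dx (high half) and ax (low half).

```c
#include <stdint.h>

/* Product of a 16-bit signed multiply, split the way imul leaves it. */
typedef struct {
    uint16_t dx; /* high-order 16 bits of the product */
    uint16_t ax; /* low-order 16 bits of the product  */
} dx_ax_t;

/* Hypothetical model of "imul src" with the implied operand in ax. */
dx_ax_t imul16(int16_t ax, int16_t src)
{
    /* A 16 x 16 bit product always fits into 32 bits. */
    int32_t  product = (int32_t)ax * (int32_t)src;
    uint32_t bits    = (uint32_t)product; /* two's complement bit pattern */
    dx_ax_t  r;

    r.dx = (uint16_t)(bits >> 16);
    r.ax = (uint16_t)(bits & 0xFFFFu);
    return r;
}
```

For example, `imul16(32767, 10)` yields the product 327670 = 4fff6h, so dx holds 4 and ax holds 0fff6h; the result no longer fits into ax alone.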

8

Integer Divide: cwd

Just as the integer multiply creates a signed integer double-word result in register pair ax and dx, the integer divide instruction assumes the numerator to be in the register pair ax and dx

But if the numerator happens to be a single precision operand, it will have to be explicitly extended

The denominator may be in a register or memory

To create a sign-extended double-register operand in the ax-dx pair from the single-precision operand in ax, the x86 architecture provides the convert-to-double instruction cwd

The cwd has no explicit operand

Assumed operand is the value in ax; ax is unchanged

The sign of ax is extended into dx

9

Integer Divide: cwd

; memory location B_word holds operand
; that operand is copied i.e. moved into register ax
; to be used as numerator in divide
; but first convert single- to double-precision

    mov ax, B_word  ; signed word at B_word in ax
    cwd             ; convert word to double-word
                    ; sign of ax extended in dx

; ditto with byte-sized operands
    mov al, a_byte  ; signed byte a_byte into al
    cbw             ; convert byte to word; sign of al extended in ah

; now the numerator can be used as operand in divide
    . . .
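Sign extension itself is easy to model in C. A minimal sketch under stated assumptions: the helper names `cwd_dx` and `cbw_ax` are hypothetical, and each function returns only the register half that the instruction writes.

```c
#include <stdint.h>

/* cwd: returns the value that dx receives; ax itself stays unchanged. */
uint16_t cwd_dx(uint16_t ax)
{
    /* Replicate the sign bit of ax 16 times. */
    return (ax & 0x8000u) ? 0xFFFFu : 0x0000u;
}

/* cbw: returns the new ax, built by extending the sign of al into ah. */
uint16_t cbw_ax(uint8_t al)
{
    return (al & 0x80u) ? (uint16_t)(0xFF00u | al) : (uint16_t)al;
}
```

With ax = 8000h (that is -32768) `cwd_dx` returns 0ffffh, so the pair dx:ax holds the same negative value in 32 bits; with ax = 7fffh it returns 0.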

10

Integer Divide

Integer divide needs 2 operands

Numerator is in ax, extended into dx as a double word

Other operand may be memory location or register

Opcode div is for unsigned and idiv for signed integer division

In example assume numerator to be in memory location A_wd

Denominator is at memory location B_wd

Quotient ends up in ax, and remainder in dx

11

Integer Divide

; signed integer divide on x86:
; assume operands to be in locations A_wd and B_wd
;
    mov  ax, A_wd   ; signed word at A_wd in ax
    cwd             ; sign of A_wd 16 times in dx
    idiv B_wd       ; quotient A_wd/B_wd in ax
                    ; remainder A_wd/B_wd in dx
                    ; flags set to see: negative?
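The quotient and remainder rules of signed division can be checked with a small C model. This is a sketch, assuming a 16-bit numerator that was sign-extended by cwd; the function names are invented for illustration. C99 division truncates toward zero, which matches idiv, and the remainder takes the sign of the numerator.

```c
#include <assert.h>
#include <stdint.h>

/* True where the hardware would raise a divide error instead of a result. */
int idiv16_faults(int16_t numerator, int16_t denominator)
{
    /* Division by zero, or a quotient (+32768) that does not fit into ax. */
    return denominator == 0 || (numerator == INT16_MIN && denominator == -1);
}

/* Value left in ax: quotient, truncated toward zero. */
int16_t idiv16_quotient(int16_t numerator, int16_t denominator)
{
    assert(!idiv16_faults(numerator, denominator));
    return (int16_t)(numerator / denominator);
}

/* Value left in dx: remainder, with the sign of the numerator. */
int16_t idiv16_remainder(int16_t numerator, int16_t denominator)
{
    assert(!idiv16_faults(numerator, denominator));
    return (int16_t)(numerator % denominator);
}
```

For instance -7 divided by 2 gives quotient -3 and remainder -1, not -4 and +1.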

12

Memory Access on the x86 Microprocessor

13

Memory Access

Key components of any computer architecture are the processor and memory

Memory is referenced implicitly and explicitly by instructions that read and write data to and from memory

Explicit accesses are called loads (for reading) and stores (for writing)

Assemblers provide explicit instructions for these operations

Implicit memory accesses occur in machine instructions whose operands may be memory cells

On RISC systems these implicit references generally do not exist; instead all memory traffic is exclusively funneled through loads and stores on RISC

14

Memory Access

In an assembler program, memory locations (both for data and code) are generally referred to symbolically

This improves readability and allows for relocation; i.e. the linker and loader have a certain degree of freedom of placement in physical memory

However, explicit memory addressing via hard-coded numbers is also possible; for example, on a hypothetical machine ld r1, 1000 could mean: load the word in memory location 1000 into register r1

Some assembly languages provide syntax to render the indirection explicit; for example the load operation ld r1, (1000) uses parentheses to allude to this indirection

15

Memory Access

A common paradigm of referencing symbolic memory names (labels) is indirection. This means that the label value (memory address) is not what is wanted, but the contents of memory at that label location

For example, if the offset of the data name foo is 10000, then the operation ld r2, foo does generally not mean to load the value 10000 into a register

Instead, foo is used indirectly: the word at that address is referenced and loaded into register r2. When the address is really wanted, the IBM 370 architecture for example uses a special type of load, called load address, while the masm assembler for the x86 architecture uses the seg or the offset operator to indicate that indirection is not wanted

Instead, the segment portion of the address, or the offset portion of the address, is wanted

16

Memory Access

During indirect memory references it is sometimes desirable to index

Indexing means that one wishes to modify an otherwise fixed memory address. Typically, such a modifier resides in a register

And if the value in that register is modified from iteration to iteration, the indexing operation can access memory in some sequential order, say in increasing (or decreasing) fashion

The constant step between such sequential memory addresses is known as the stride. For example, if r2 is a register loaded with a value (say, 2) then the instruction ld r1, foo[r2] means: fetch the word which is located 2 bytes further in memory than the offset expressed by foo, and load that word into register r1

17

Memory Access

In addition to indexing through a register, many architectures (and thus their assemblers) allow the offset to be modified by an additional literal index

The literal value is encoded into the instruction, referred to as an immediate operand

Immediate values are usually small, since architectures often provide just a few bits to hold them

On some architectures this immediate operand may be signed, on others only unsigned literal modifiers are possible

18

Memory Access

Memory holds the data being manipulated

Also intermediate results must be stored somewhere

Registers usually are in short supply, contrasted with the size of memory

Before completing a computation, data must be brought from peripherals to memory

After computation, data must be sent from memory to peripherals, e.g. printers

Often a cache helps overcome the speed bottleneck of memory accesses

19

Memory Access

Indexing on x86

Indirect memory references are the default semantics on assemblers for the x86 architecture

On nasm and masm this can also be expressed explicitly via the [ ] operator pair

For example the move instruction (this mov is really a load):

data_seg1 segment
foo dw -1999, 0, . . .
data_seg1 ends
    . . .
    mov ax, foo     ; indirection implied in masm
    mov ax, [foo]   ; explicit indirection in masm

20

Memory Access

Indexing on x86

Under masm, the above mov code loads the word at data segment location foo into register ax, regardless of whether the [] operator is used

In the nasm assembler the instruction

    mov ax, foo     ; load offset of address in nasm

loads the address of the memory location into register ax, while the nasm mnemonic:

    mov ax, [foo]   ; loads contents at address in nasm

loads the contents of memory location foo into ax; assembler differences can be very subtle!

21

Memory Access

Indexing on x86

A handy programming tool that makes indexing so convenient is the ability to modify address labels by registers, literals, or a combination of both

Clearly, the underlying computer architecture must support this, i.e. there must be instructions in place that allow indexed or multiply indexed load and store operations

Some architectures (including IBM 360 and x86) allow multiple registers to be used to modify (to index) the address label

These registers are referred to as base- and as index-registers

Note that the term base-register often means that the base address sits in that register

22

Memory Access

Indexing on x86

However, in the x86 architecture, as long as an address expression includes a data memory label, that label is the base address

With the following provisos: If l1, l2 are address labels, and c1, c2 are numeric literal constants, then:

l1 + c1     ; is address of location l1 plus c1
l1 - c2     ; is address of location l1 minus c2
l1 - l2     ; is a pure numeric value: l1 - l2
[l2 + c1]   ; is the memory content at l2 + c1
[l1 - c2]   ; is the memory content at l1 - c2
l1 + l2     ; is illegal on x86

23

Memory Access

Indexing on x86

On orthogonally designed architectures, any user-visible register is usable as an index (or base-) register

Practical limitations often forced compromises. For example, on the x86 architecture, only certain registers can be used for indexing, listed below:

address expression + one of ( bx, bp, si, di ) on x86

An address expression, being indexed by one (or even 2) of these index registers, may also contain a literal modifier, making the indexing operation practical and easy to use for array indexing. Note that it is possible to use up to 2 index- and base-registers in a single address expression, but only with the following restriction:

address expression + two of ( ( bx or bp ) and ( si or di ) ) on x86

24

Memory Access

Indexing on x86

An address expression such as [min_data+bx+si+2] is allowed, while the expression [min_data+bx+bx] is not permitted due to multiple uses of the bx register

These samples assume that min_data is a legal label in the data segment

A complete expression with all typical arithmetic operators is allowed on x86 assemblers, as long as the resulting value is computable (and reducible to) a single numeric value at the time of assembly

Thus, an expression like [chars+bx+si+2*3+4] is legal, provided that chars is a legal data label
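How such an address expression resolves to one 16-bit offset can be sketched in C. This is only a model under stated assumptions: `effective_offset` is a hypothetical helper, the label value 0100h is made up, and segment registers are ignored. The literal part (2*3+4) is folded into one constant at assembly time; the register parts are added at run time, and the sum wraps around at 16 bits.

```c
#include <stdint.h>

/* Offset computed for [label + bx + si + disp] within a 64 KB segment. */
uint16_t effective_offset(uint16_t label, uint16_t bx, uint16_t si,
                          uint16_t disp)
{
    /* The cast models the wrap-around of 16-bit address arithmetic. */
    return (uint16_t)(label + bx + si + disp);
}
```

With chars assumed at offset 0100h, bx = 2 and si = 4, the expression [chars+bx+si+2*3+4] refers to offset `effective_offset(0x0100, 2, 4, 2*3+4)`, i.e. 0110h.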

25

Memory Access

Implicit Segment Register

Data declared in the data segment below are digits hex '0' .. 'f'

The user-designed Put_Char macro uses system service call 02h for single-character output

Using bx as index register

Note that only base and index registers can be used for this purpose, e.g. not cx

Memory operands (data labels) are used indirectly

Indirection is explicitly expressed via the [ and ] operator

But not necessarily needed for memory operands in Microsoft SW, as indirection is the most common case

Since it is needed in nasm and Unix systems, we recommend use of the [ ] operation

26

Memory Access

Implicit Segment Register

A further benefit of explicit brackets, such as [chars], is improved readability: they allude to the indirection

Note that indirect offset and index register are both allowed

Either or both or none may be modified by an immediate operand

Immediate operands are limited to 16 bits in size

Order of offset and index is arbitrary

The output of program below is:

hm02012452267

27

Memory Access

; Purpose: memory references, indexing
; HM for use at CCUT
start macro                 ; no parameters
    mov ax, @data           ; @data predefined macro
    mov ds, ax              ; now data segment reg set
    endm                    ; end macro: start

termin macro ret_code       ; one parameter: return code, assume 0
    mov ah, Term_Code       ; we wanna terminate, ah + al
    mov al, ret_code        ; any errors? If /= 0
    int 21h                 ; call DOS for help
    endm                    ; end macro: termin

Put_Char macro char         ; output passed character
    mov ah, Cout_Code       ; tell DOS: Char out
    mov dl, char            ; char into required byte reg
    int 21h                 ; and call DOS
    endm                    ; end macro Put_Char

Cout_Code = 2h
Term_Code = 4ch

    .model small
    .data
chars db "0123456789abcdef"

28

Memory Access

    .code
main: start
    mov bx, 2               ; index char '2' in chars
    mov cl, 'h'
    Put_Char cl             ; o.k. since cl holds char
    Put_Char 'm'
    Put_Char chars          ; not good programming
    Put_Char chars[bx]      ; shows partial indirection
    Put_Char [chars]        ; explicit
    Put_Char [chars+1]      ; explicit
    Put_Char [chars+bx]
    Put_Char chars[bx+2]
    Put_Char [chars+bx+3]
    Put_Char [bx][chars]
    Put_Char [chars]+[bx]
    Put_Char [bx+4][chars]
    Put_Char [bx+3][chars+2]

done: termin 0 ; no errors if we reach

29

Memory Access

Explicit Segment Register

Again the data in the data segment is the character string "0123456789abcdef"

Macros as in the example earlier

Use bx as index register again

Note: no implicit segment register used

Instead, ds is used explicitly

Note syntax: seg:offset

The output of program below is:

h02012452267

30

Memory Access

; Source file: mem2.asm; use explicit segment reg
; Purpose: memory ref, indexing with explicit ds:

    .model small
    .data
chars db "0123456789abcdef"

    .code
main: start
    mov bx, 2               ; index '2' in chars

31

Memory Access

    Put_Char 'h'
    Put_Char ds:chars           ; not good programming
    Put_Char ds:chars[bx]       ; only partial indirect
    Put_Char ds:[chars]         ; explicit
    Put_Char ds:[chars+1]       ; explicit
    Put_Char ds:[chars+bx]
    Put_Char ds:chars[bx+2]
    Put_Char ds:[chars+bx+3]
    Put_Char ds:[bx][chars]
    Put_Char ds:[chars]+[bx]
    Put_Char ds:[bx+4][chars]
    Put_Char ds:[bx+3][chars+2]
    . . .

32

Word Access

Goal: to reference memory as words

Output these integers as decimal numbers

Use the yet to be designed PutDec() assembler procedure to print decimal numbers

Macros start and termin as before

Use register bx again as index register

Data segment defines some decimal and some hex literals

Data label nums defines an array of integer words

33

Word Access

Observe that modifications to the index register are done in steps of 2

Stride of word is 2 on x86!

Note that words initialized via hex literals are still printed as signed integers

Intended output shown below:

Output: 511 512 512 513 1025 -8531 -8531 -17730 -17730

34

Word Access

; Purpose: word memory references, indexing
start macro                 ; no parameters
    . . .                   ; body and endm as in the earlier example

termin macro ret_code       ; one parameter: return code, assume 0
    mov ah, Term_Code       ; terminate, ah + al
    mov al, ret_code        ; any errors? If /= 0
    int 21h                 ; call DOS for help
    endm                    ; end macro: termin

Term_Code = 4ch
    .model small
    .data
nums dw 511, 512, 513, 1023, 1024, 1025
w1 dw 0deadh
w2 dw 0beefh
w3 dw 0c0edh
w4 dw 0babeh

35

Word Access, Cont'd

    .code
    extrn PutDec : near

main: start
    mov bx, 2               ; use bx as index register
    mov ax, nums
    call PutDec             ; output is: 511

    mov ax, [nums + 2]
    call PutDec             ; output is: 512

    mov ax, [nums + bx]
    call PutDec             ; output is: 512

    mov ax, [nums][bx + 2]
    call PutDec             ; output is: 513

    mov ax, [nums+2][bx+6]
    call PutDec             ; output is: 1025

36

Word Access, Cont'd

    mov nums, 0deadh
    mov ax, nums
    call PutDec             ; output is: -8531

    mov ax, w1
    call PutDec             ; output is: -8531

    mov ax, [w2+bx+2]
    call PutDec             ; output is: -17730

    mov ax, [w1+6]
    call PutDec             ; output is: -17730

    end main                ; start here!
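The word accesses above can be replayed in C on a byte image of the data segment. A sketch under stated assumptions: the array `data_seg`, the offsets `NUMS`, `W1`, `W2` and the function `load_signed_word` are made up for illustration, and the image assumes that the assembler places the dw items contiguously in declaration order, low byte first.

```c
#include <assert.h>
#include <stdint.h>

/* Little-endian byte image of the data segment declared above. */
static const uint8_t data_seg[20] = {
    0xFF, 0x01, 0x00, 0x02, 0x01, 0x02, /* nums: 511, 512, 513    */
    0xFF, 0x03, 0x00, 0x04, 0x01, 0x04, /*       1023, 1024, 1025 */
    0xAD, 0xDE,                         /* w1 = 0deadh            */
    0xEF, 0xBE,                         /* w2 = 0beefh            */
    0xED, 0xC0,                         /* w3 = 0c0edh            */
    0xBE, 0xBA                          /* w4 = 0babeh            */
};

enum { NUMS = 0, W1 = 12, W2 = 14 };    /* byte offsets of the labels */

/* Load the 16-bit word at a byte offset and read it as a signed integer. */
int load_signed_word(unsigned offset)
{
    unsigned raw;

    assert(offset + 1 < sizeof data_seg);
    raw = data_seg[offset] | ((unsigned)data_seg[offset + 1] << 8);
    /* Two's complement: patterns from 8000h upward are negative. */
    return raw >= 0x8000u ? (int)raw - 65536 : (int)raw;
}
```

With bx = 2, `load_signed_word(NUMS + 2 + 2 + 6)` corresponds to [nums+2][bx+6] and yields 1025, while `load_signed_word(W2 + 2 + 2)` corresponds to [w2+bx+2], reaches w4, and yields -17730. The byte offsets advance in steps of 2 because a word is 2 bytes long.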

37

Loop Constructs

38

Comparison

By default, a machine executes one instruction after another, in sequence

That sequence can be changed via branches

Branches are also known as jumps, depending on manufacturer

Unconditional branches transfer control to their defined destination

Conditional branches make this change in control flow only if their associated condition holds

39

Comparison

How does the microprocessor "know" when or whether a condition is true?

The CPU has flags that specify this condition, and instructions that test for the condition

Typical conditions are zero, negative, positive, overflow, carry, etc.

Symbolic flags are CF, ZF, OF

These can be used as operands in conditional branches, conditional calls etc.

40

Conditional Branch

-- high-level source program snippet
if a > b then
    max := a;
else
    max := b;
end if;

; corresponding x86 assembler snippet:
    mov ax, [a]         ; memory location a
    cmp ax, [b]         ; memory location b
    jle b_is_max        ; jump to b_is_max if a <= b
    mov [max], ax
    jmp end_if          ; jump around else
b_is_max:               ; this is else
    mov ax, [b]
    mov [max], ax
end_if:
    . . .
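The same control flow can be written in C with explicit gotos, which makes the correspondence between the inverted condition and the jump visible. A minimal sketch; the function name `max_by_branches` is made up, and the local variable ax merely stands in for the register.

```c
/* Mirrors the assembler snippet: the test is inverted and jumps to the else. */
int max_by_branches(int a, int b)
{
    int max;
    int ax = a;                     /* mov ax, [a]                    */

    if (ax <= b) goto b_is_max;     /* cmp ax, [b] then jle b_is_max  */
    max = ax;                       /* then-part: mov [max], ax       */
    goto end_if;                    /* jmp end_if: jump around else   */
b_is_max:
    ax = b;                         /* else-part: mov ax, [b]         */
    max = ax;                       /*            mov [max], ax       */
end_if:
    return max;
}
```

Note that the source condition is a > b, while the branch tests the opposite condition a <= b, because the branch skips the then-part.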

41

Loops

Operations are performed repeatedly via loops

In higher level languages, loops are hand-manufactured via conditions and branches (If Statement and Gotos) or using language defined structured loop statements

The latter include Repeat, While, and For Statements

We introduce the x86 loop instruction

Generally a loop body is repeated until a particular value (sentinel) is found

A loop body entered unconditionally is akin to a Repeat Statement

42

x86 Loop

Another assembler example knows the iteration count at the time of assembly, hence the x86 provided loop instruction can be used

A sample x86 loop instruction follows:

    loop next   ; is executed: if --cx then goto next;

This loop body can be characterized as a For Statement

The third example does not know the number of iterations at the time of assembly. Hence, before entering the loop body the first time, a check must be made whether the loop count is 0

If so, the body is bypassed; else the body is entered and executed countably many times. Thus, the loop resembles a C-style For Statement

43

x86 Loop

We saw that loops allow the repeated operation of their bodies

Based on a condition, or based on a defined number of steps, which in effect defines that condition

On the x86 architecture, the cx register functions as the counter for counted loops, with the loop opcode

On x86 the counted loop is executed by the loop instruction, assuming the loop count in cx

During each execution of the loop opcode, the value in cx is decremented by 1

As long as cx is not 0 after the decrement, execution continues at the place of the loop label

Else execution continues at the next instruction after the loop opcode

44

x86 Loop

; demonstrate the x86 "loop" instruction
; assumes count to be in cx
; when loop is executed: decrement cx
; once cx is 0, continue at instruction after loop
; else branch to label

; place 10 into cx to define loop steps
    mov cx, 10

again:                  ; a label! Note the colon :
    mov ax, cx          ; print value in ax
    call PutDec         ; via PutDec procedure
    loop again          ; check, if need to loop more

; prints the numbers 10 down to 1, but NOT 0
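The decrement-then-test behavior of loop can be modelled in C as a do-while over a 16-bit counter. This is a sketch with hypothetical function names; it counts how often the loop body runs for a given initial cx. It also shows why a loop count of 0 must be checked before the body is entered: the decrement wraps 0 around to 0ffffh, so the body would run 65536 times.

```c
#include <stdint.h>

/* Number of times the body runs when it is closed by a loop instruction. */
uint32_t loop_iterations(uint16_t cx)
{
    uint32_t runs = 0;

    do {
        runs++;                     /* the loop body                    */
        cx = (uint16_t)(cx - 1);    /* loop: first decrement cx ...     */
    } while (cx != 0);              /* ... then branch back if cx != 0  */
    return runs;
}

/* Same loop, protected by a zero test before the first iteration. */
uint32_t guarded_loop_iterations(uint16_t cx)
{
    if (cx == 0) {                  /* on x86 this test is jcxz         */
        return 0;
    }
    return loop_iterations(cx);
}
```

`loop_iterations(10)` returns 10, matching the printout 10 down to 1, whereas `loop_iterations(0)` returns 65536 and `guarded_loop_iterations(0)` returns 0.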

45

First Loop

We define a string in data segment, all '0'..'f' digits

The data area is named 'chars' and is used as address (data offset)

The sentinel for loop termination is '#'

Register bx used as index register

Note that only bx, si, di, and bp can be used for indexing on x86

Practice the cmp instruction, which compares by subtracting, and then sets flags

Get to know the conditional (jcc) and the unconditional jump (jmp)

See use of labels as destinations of jumps

Output of program is:

0123456789abcdef

46

First Loop

; Source file: loop1.asm
; Purpose: use, syntax of indexing array w. sentinel
Start macro                 ; no parameters
    . . .                   ; body and endm as in the earlier example

Termin macro ret_code       ; 1 parameter: return code
    mov ah, 4ch             ; terminate: set ah + al
    mov al, ret_code        ; any errors? If /= 0
    int 21h                 ; call sys sw for help
    endm                    ; end macro: termin

Char_Out = 2h
Sentin = '#'

    .model small
    .data
chars db "0123456789abcdef", Sentin

47

First Loop

    .code
main: start
    mov ah, Char_Out        ; set up ah for sys
    mov bx, 0               ; to index string, init 0

next: mov dl, chars[bx]     ; find next char
    inc bx                  ; increment index reg bx
    cmp dl, Sentin          ; found sentinel?
    je done                 ; yep, so stop
    int 21h                 ; nope, so print it
    jmp next                ; try next; could be sentinel



48

Second Loop

Again we define character string in data segment, all '0'..'f' hex digits

This time we use no sentinel

Assume that the loop is executed exactly 16 times, and that this is known a priori, i.e. a countable loop

Again we use register bx as index register

Learn the loop instruction, which combines tracking the loop count with a conditional branch

Loop instruction on x86 subtracts 1 from cx each time it is executed

If cx = 0, fall through; else branch to target, which is part of instruction

Output of program is:

0123456789abcdef

49

Second Loop

; Source file: loop2.asm
; Purpose: use, syntax of indexing char array
; loop is "countable": we know # of elements
; before start of loop; we know at assembly time

. . . same macros start, termin

Char_Out = 2h
Num_El = 10h                ; 16 elements in chars array[]

.model small


50

Second Loop

    .code                   ; abbreviation
main: start
    mov ah, Char_Out        ; set up ah for system call
    mov bx, 0               ; initial index off 'chars'
    mov cx, Num_El          ; know # iterations a priori

next: mov dl, chars[bx]     ; find next char
    inc bx                  ; increment index register
    int 21h                 ; print it
    loop next               ; try next one; could be 0:

    end

51

Third Loop

Again we define a character string in data segment, all '0'..'f' hex digits, no sentinel

Assume iteration count is not known a priori

Again use register bx as index register

Must check whether cx is less than or equal to zero

Caution: If cx were negative, this would be bad news, as looping will be excessive!

Good that x86 provides a special opcode jcxz

Loop instruction on x86 subtracts 1 from cx; should start with a positive value

New instruction jcxz: if cx is already zero at start, branch and don't enter loop body

Output of program is:

0123456789abcdef

52

Third Loop

; Source file: loop3.asm
; Purpose: use, syntax of indexing char array

    .model small

    .code
main: start
    mov ah, Char_Out        ; set up ah for DOS call
    mov bx, 0               ; initial index off 'chars'

; assume that # read at run time
; fake this reading by brute-force setting
; but the point is: The # could be non-positive!

53

Third Loop

    mov cx, 16              ; pretend we read value of cx

    cmp cx, 0               ; then test if cx < 0
    jl done_neg             ; if it is, jump
    jcxz done_zero          ; if it is zero, jump also

; if we reach this: cx is positive

next: mov dl, [chars][bx]   ; find next char
    inc bx                  ; increment index register
    int 21h                 ; output next character
    loop next               ; try next one; could be end

done: termin 0              ; no errors if we reach
done_neg: termin 1          ; another error code. Not 0
done_zero: termin 2         ; and yet another error


54

X86 Call and Return

55

Call and Return

High level programming requires logical (and physical) modularization to render the overall programming task manageable

Key tool for logical modularization is the creation of procedures (in some languages called subroutines, functions, etc.) with their associated calls and returns

This section introduces calling and returning, also known as context switching

We’ll use the term procedure generically to mean procedure, function, or subroutine, unless the particular meaning is needed

56

Call and Return

It is not feasible to express a complete program as a single procedure when the program is large

Logical modules reduce complexity of programming task

This allows re-use and reincarnation of the same procedure through parameterization

A High Level language should hide the detail of call/return mechanism; not so in assembler

For example, the manipulation of the stack through push and pop operations should be hidden

However, some aspects of context switch should be reflected in High Level language, in particular the call and return

57

Call and Return

As in High-Level language programs, procedures are a key syntax tool to modularize

Physical modules (procedures) encapsulate data and actions that belong together

Physical modules (delineated by the proc and endp keywords) are the language tool to define modules

Procedures can be called, via the call opcode, parameterized by the procedure name, e.g.:

    call PutDec

Procedures return, via the ret instruction

If they return a result to the calling environment, we refer to them as functions

A return ends up at the instruction after the call

58

Call and Return

Stack Frame

Stack Pointer identifies top of current stack, and also top of current Stack Frame

Stack pointer may vary often during invocation

Stack pointer changes upon call, return, push, pop, explicit assignments

Base pointer does not vary during call

Base pointer only set up once at start of call

Base pointer changed again at return, to value of previous base pointer, the dynamic link

Parameters can be addressed relative to base pointer in one direction

59

Call and Return

Stack Frame

Locals (and temps) can be addressed relative to base pointer in the other direction

Possible to save base pointer, useful when registers are scarce, as on x86

However, this scheme is difficult, since compiler (or human programmer) must keep dynamic size of stack in mind at any moment of time of code generation; not discussed here

60

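Call and Return

Stack Frame

[Figure: layout of a Stack Frame, consisting of Locals + Temps, Stack Marker, and Actual Parameters; sp marks the top of the frame, bp points to the Stack Marker]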

61

Call and Return

Before Call

Push actual parameters: Changes the stack

Track size of actual parameters pushed

In most languages the actual size is fixed; not so in C

Base pointer still points to Stack Marker of caller

After last actual parameter pushed: one flexible part of Stack Frame complete

62

Call and Return

Call

Push the instruction pointer (ip)

The address of the instruction after the call must be saved as return address

This identifies the beginning of the Stack Marker

Set instruction pointer (ip, AKA pc) to the address of the destination (callee)

x86 architecture has 24 flavors of call instructions

63

Call and Return

Procedure Entry

Push Base Pointer, this is the dynamic link

Set Base Pointer to the value of the Stack Pointer

Now the new Stack Frame is being addressed

The fixed part of stack, the Stack Marker, is being built

Allocate space for local variables, if any

This establishes another area of the Stack Frame that is variable in size

64

Call and Return

Return

Pop locals and temps off stack

This frees the second variable size area from the Stack Frame

Pop registers to be restored

Pop the top of stack value into the Base Pointer (bp)

This uses the Dynamic Link to reactivate the previous Stack Frame

Pop top of stack value into instruction pointer

65

Call and Return

Return

This sets the ip register back to the instruction after the call

The return instruction does this!

Either the caller, or a suitable argument of the return instruction, frees the space allocated for actual parameters

Note that the x86 architecture allows an argument to the ret instruction, freeing that number of bytes off the stack

66

Call and Return Code

1a. Procedure Entry, No Locals, Save Regs

    push bp         ; save dyn link in Stack Marker
    mov  bp, sp     ; establish new Frame: point to S.M.
    push ax         ; save ax if needed by callee, opt.
    push bx         ; ditto for bx

67

Call and Return Code

1b. Procedure Exit, No Locals, Restore Regs

    pop bx          ; restore bx if was used by callee
    pop ax          ; ditto for ax
    pop bp          ; must find back old Stack Frame
    ret args        ; ip to instruction after call; free args

68

Call and Return Code

2a. Procedure Entry, With Locals, No Regs

    push bp         ; save dyn link in Stack Marker
    mov  bp, sp     ; establish new Frame: point to S.M.
    sub  sp, 24     ; allocate 24 bytes uninitialized
                    ; space for locals

69

Call and Return Code

2b. Procedure Exit, With Locals, No Regs

    mov sp, bp      ; free all locals and temps
    pop bp          ; must find old S.F., RA on top
    ret args        ; ip to instruction after call
                    ; free args

70

Call and Return Code

3a. Procedure Entry, With Locals, Save Regs

    push bp         ; save dyn link in Stack Marker
    mov  bp, sp     ; establish new Frame: point to S.M.
    sub  sp, 24     ; allocate 24 bytes uninitialized
                    ; space for locals
    push ax         ; save ax if needed by callee, opt.
    push bx         ; ditto for bx

71

Call and Return Code

3b. Procedure Exit, With Locals, Restore Regs

    pop bx          ; restore bx if was used by callee
    pop ax          ; ditto for ax
    mov sp, bp      ; free all locals and temps
    pop bp          ; must find back old S.F., RA on top
    ret args        ; ip to instruction after call; free args
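The entry and exit sequences can be traced on a toy stack in C. The sketch below is a simulation under stated assumptions, not part of the course code: the type `machine_t`, its helpers, the 64-byte stack, the return address 0100h, and the caller's bp value 1234h are all made up. It follows variant 2a/2b for a near call with one word-sized argument and shows that the callee finds the argument at [bp+4], above the saved bp and the return address.

```c
#include <stdint.h>

#define STACK_BYTES 64u

typedef struct {
    uint8_t  mem[STACK_BYTES];  /* toy stack segment               */
    uint16_t sp;                /* stack pointer, grows downward   */
    uint16_t bp;                /* base pointer                    */
} machine_t;

static void push16(machine_t *m, uint16_t v)
{
    m->sp = (uint16_t)(m->sp - 2);
    m->mem[m->sp]     = (uint8_t)(v & 0xFFu);   /* low byte first  */
    m->mem[m->sp + 1] = (uint8_t)(v >> 8);
}

static uint16_t word_at(const machine_t *m, uint16_t off)
{
    return (uint16_t)(m->mem[off] | (m->mem[off + 1] << 8));
}

static uint16_t pop16(machine_t *m)
{
    uint16_t v = word_at(m, m->sp);
    m->sp = (uint16_t)(m->sp + 2);
    return v;
}

/* Returns what the callee reads at [bp+4]; 0xFFFF signals a broken frame. */
uint16_t arg_seen_by_callee(uint16_t arg)
{
    machine_t m = { {0}, STACK_BYTES, 0x1234u };    /* assumed caller state */
    const uint16_t callers_bp = m.bp;
    uint16_t seen;

    push16(&m, arg);                    /* caller: push actual parameter    */
    push16(&m, 0x0100u);                /* call: push return address        */
    push16(&m, m.bp);                   /* push bp: save dynamic link       */
    m.bp = m.sp;                        /* mov bp, sp                       */
    m.sp = (uint16_t)(m.sp - 24);       /* sub sp, 24: space for locals     */

    seen = word_at(&m, (uint16_t)(m.bp + 4));   /* mov ax, [bp+4]           */

    m.sp = m.bp;                        /* mov sp, bp: free locals          */
    m.bp = pop16(&m);                   /* pop bp: back to old frame        */
    (void)pop16(&m);                    /* ret: pop return address ...      */
    m.sp = (uint16_t)(m.sp + 2);        /* ... ret 2 also frees the arg     */

    return (m.sp == STACK_BYTES && m.bp == callers_bp) ? seen : 0xFFFFu;
}
```

After the exit sequence both sp and bp hold their values from before the call, which is the invariant every call and return pair must preserve.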

72

Call and Return

Recursive Factorial in C

// source: fact.c
. . .
unsigned fact( unsigned arg )
{ // fact
    if ( arg <= 1 ) {
        return 1;
    }else{
        return fact( arg - 1 ) * arg;
    } //end if
} //end fact

73

Call and Return

Recursive Factorial in x86

; Source file: fact.asm
penter macro
    push bp
    mov bp, sp
    push bx
    push cx
    push dx
    endm

pexit macro args
    pop dx
    pop cx
    pop bx
    pop bp
    ret args
    endm

Errcode = 4ch
MAX = 9d

    .model small
    .stack 100h
    .data
arg dw 0

74

Call and Return

Recursive Factorial in x86

    .code
    extrn uPutDec : near

; assume arg on tos; return fact( int arg ) in ax

rfact proc
    penter
    mov ax, [bp+4]      ; arg is 4 bytes before dyn link
    cmp ax, 1           ; argument > 1?
    jg recurse          ; if so: recursive call

base: mov ax, 1         ; No: then 0! = 1! = 1
    pexit 2             ; done, free 2 bytes = arg

recurse:
    mov ax, [bp+4]      ; recurse; get next arg
    dec ax              ; but decrement first
    push ax             ; and pass on stack
    call rfact          ; recurse!
    mov cx, [bp+4]      ; product in ax, * arg
    mul cx              ; product in ax
    pexit 2             ; and done

rfact endp

75

Call and Return

Recursive Factorial in x86

drive_r proc
    mov arg, 0          ; initial memory
    mov ax, 0           ; initial value again
    mov bp, sp          ; no space for locals needed

again_r:
    cmp arg, MAX
    jge done_r

    ; ax holds argument to be factorialized :-)
    push ax             ; argument on stack
    call rfact

    ; now ax holds factorial value
    call uPutDec        ; print next result
    inc arg             ; compute next fact(arg)
    mov ax, arg         ; pass in ax
    jmp again_r

done_r: ret
drive_r endp
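The bound MAX = 9 stops the driver after fact(8). A short C sketch suggests why such a bound is needed with 16-bit results; the function names are made up, and `fact16` only models keeping the low 16 bits (the ax half) of each product.

```c
#include <stdint.h>

/* Factorial where every product is truncated to 16 bits, as in ax. */
uint16_t fact16(unsigned n)
{
    uint16_t ax = 1;
    unsigned i;

    for (i = 2; i <= n; i++) {
        ax = (uint16_t)((uint32_t)ax * i);  /* keep low-order 16 bits */
    }
    return ax;
}

/* Largest argument whose exact factorial still fits into 16 bits. */
unsigned largest_exact_arg(void)
{
    uint32_t exact = 1;     /* exact factorial of n */
    unsigned n = 0;

    while (exact * (n + 1) <= 0xFFFFu) {
        n++;
        exact *= n;
    }
    return n;
}
```

`largest_exact_arg()` returns 8, since 8! = 40320 fits into an unsigned word while 9! = 362880 does not; `fact16(9)` returns only the truncated value 35200.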

76

Design Asm Procedure PutDec

77

Design PutDec

Goal Definition

Design an assembly language procedure, which prints a passed integer value in decimal notation

Values are passed in a machine register

Values may be positive or negative

Use x86 small arithmetic, i.e. 16-bit integer precision, to easily track overflow, minimum and maximum integer values

We’ll proceed stepwise:

1. Printing a character

2. Printing a decimal digit, given an integer value 0..9

3. Finally printing the complete integer

78

Design PutDec

Define Macro Put_Ch to print one character

; character is passed in dl
; fiddle with ax, dx; restore before finishing

Put_Ch macro char       ; 'char' is char to be printed
    push ax             ; save ax
    push dx             ; ditto for dx; use only dl
    mov dl, char        ; move into formal parameter
    mov ah, 02h         ; tell system SW whom to call
    int 021h            ; call system SW, e.g. DOS
    pop dx              ; restore
    pop ax              ; ditto
    endm

79

Design PutDec

Print integer value 0..9 in dl as a character

; assume integer 0..9 to be in dl
; convert to ASCII character; simple: just add '0'
;
p_char: add dl, '0'     ; convert int to char
    Put_Ch dl           ; previously defined macro

80

Design PutDec

Print rightmost digit of number in ax in decimal

; ax holds non-negative integer value
; but is a binary number, i.e. binary 0..9; need ASCII
    mov bx, 10          ; base 10 is in bx
    sub dx, dx          ; make ax-dx a double word
    div bx              ; unsigned divide ax by 10
                        ; remainder is in dx
                        ; known to be < 10, so dl holds it
    add dl, '0'         ; make int a printable char
    Put_Ch dl           ; print that char

81

Asm Source For Procedure PutDec

82

PutDec Asm Code: Macros

; Purpose: print various signed 16-bit numbers
start macro
    mov ax, @data       ; typical for MS system SW
    mov ds, ax
    endm

finish macro            ; also MS system SW
    mov ax, 4c00h
    int 21h
    endm

Put_Ch macro char       ; 'char' char is printed
    push ax             ; save cos ax is overwritten
    push dx             ; ditto for dx
    mov dl, char        ; move character into parameter
    mov ah, 02h         ; tell DOS who
    int 021h            ; call DOS
    pop dx              ; restore
    pop ax              ; ditto
    endm

83

PutDec Asm Code: Macros

Put_Str macro str_addr  ; print string at 'str_addr'
    push ax             ; save
    push dx             ; save
    mov dx, offset str_addr
    mov ah, 09h         ; DOS proc id
    int 021h            ; call DOS
    pop dx              ; restore
    pop ax              ; ditto
    endm

base_10 = 10
    .model small
    .stack 500
    .data
min_num db '-32768$'            ; end strings with '$'
num_is  db 'the number is: $'
cr_lf   db 10, 13, '$'          ; magic numbers for lf, cr

PutDec Asm Code: Body

        .code
; ax value printed as a decimal integer number
PutDec  proc
        ; special case -32768 cannot be negated
        cmp  ax, -32768     ; is it special case?
        jne  do_work        ; nope, so do your real job
        Put_Str min_num     ; yep: so print it and be done
        ret                 ; done. Printed -32768
do_work:                    ; ax NOT -32768; is negative?
        push ax
        push bx
        push cx
        push dx
        cmp  ax, 0          ; negative number?
        jge  positive       ; if negative, fall through: invert sign, print -
        neg  ax             ; here the inversion
        Put_Ch '-'          ; print - sign; Put_Ch preserves ax
positive:
        sub  cx, cx         ; cx counts steps = # digits
        mov  bx, base_10    ; divisor is 10
        ; now we know number in ax is non-negative

PutDec Asm Code: Body

; continue with non-negative number
push_m: sub  dx, dx         ; make dx:ax a double word
        div  bx             ; unsigned divide o.k.
        add  dl, '0'        ; make number a char
        push dx             ; save; order reversed
        inc  cx             ; count steps
        cmp  ax, 0          ; finally done?
        jne  push_m         ; if not, do next step
        ; now all chars are on the stack; popping yields them in l-2-r order
pop_m:  pop  dx             ; pop to dx; dl of interest
        Put_Ch dl           ; print it as char
        loop pop_m          ; more work? If so, do again
done:   pop  dx             ; restore what you messed up
        pop  cx             ; ditto
        pop  bx
        pop  ax
        ret                 ; return to caller
PutDec  endp
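For readers who want to check the control flow of PutDec without a DOS environment, the same algorithm can be sketched in C. This is a model under stated assumptions, not the course's code: the name put_dec_model, the output buffer, and the small local array that stands in for the hardware stack are all hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of PutDec. Writes the decimal text of value into out
   (at least 7 bytes: sign, 5 digits, terminating NUL) and returns out. */
char *put_dec_model(int16_t value, char *out)
{
    char stack[5];      /* stands in for the hardware stack */
    int count = 0;      /* plays the role of cx */
    char *p = out;
    uint16_t ax;

    if (value == INT16_MIN) {       /* -32768 cannot be negated in 16 bits */
        strcpy(out, "-32768");
        return out;
    }
    if (value < 0) {                /* print '-' and continue with the magnitude */
        *p++ = '-';
        value = (int16_t)-value;
    }
    ax = (uint16_t)value;
    do {                            /* push_m: runs at least once, so 0 prints "0" */
        stack[count++] = (char)('0' + ax % 10);
        ax /= 10;
    } while (ax != 0);
    while (count > 0)               /* pop_m: last pushed digit comes out first */
        *p++ = stack[--count];
    *p = '\0';
    return out;
}
```

The do-while mirrors the assembler loop, which tests ax only after the first division; that is what guarantees one digit of output for the value 0.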

PutDec Asm Code: Driver

; output readable string. Print #, carriage-return
next_n  proc
        Put_Str num_is      ; print message
        call PutDec         ; print #
        Put_Str cr_lf       ; cr lf
        ret
next_n  endp                ; repeat label before endp

num     macro val           ; just to practice macros
        mov  ax, val        ; PutDec expects # in ax
        call next_n         ; message, print #, cr lf
        endm

PutDec Asm Code: Main

main    proc                ; entry point under Windows
        start               ; set up for OS
        ; exercise all kinds of cases, including corner cases
        num  -32768         ; all macro expansions
        num  -32767         ; ditto
        num  32767          ; put this # into ax
        num  100
        num  1
        num  -1
        num  0
        num  0ffh           ; 255
        finish
main    endp

        end  main           ; this IDs the entry point; can be different name

Appendix: Some Definitions

Definitions

Activation Record

Synonym for Stack Marker

Base Address

Memory address of an aggregate area

Usually a segment register, AKA base register, is used to hold a base address

Addressing can then proceed relative to such a base address

Base Pointer

An address pointer (often implemented via a dedicated register) that identifies an agreed-upon area in the Stack Frame of an executing procedure

On the x86 architecture this is implemented via the dedicated bp register

Binding

Procedures may have parameters

Formal parameters express attributes such as type, name, and similar attributes

At the place of call, these formal parameters receive initial, actual values through so-called actual parameters

Sometimes, an actual parameter is solely the address of the true object referenced during the call

The association of actual to formal parameter is referred to as parameter binding
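The two kinds of binding described above, a value and the address of the true object, can be illustrated with a short C sketch. The function names by_value and by_address are hypothetical and chosen only for this example.

```c
/* The formal parameter receives a copy of the actual value;
   changing it does not affect the caller's object. */
void by_value(int formal)
{
    formal = 99;
    (void)formal;   /* the assignment is deliberately without outside effect */
}

/* The actual parameter is the address of the true object;
   the callee changes the caller's object through it. */
void by_address(int *formal)
{
    *formal = 99;
}
```

After by_value(x) the caller's x is unchanged, while after by_address(&x) it holds 99.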

Branch

Transfer of control to a destination that is generally not the instruction following the branch

Synonym: Jump. The destination is an explicit or implicit operand of the branch instruction

Call

Transfer of control (a.k.a. context switch) to the operand of the call instruction

A call expects that after completion, the program resumes execution at the place after the call instruction

Countable Loop

Loop in which the number of iterations can be computed (is known) before the loop body starts

Thus the loop body must include code to change the remaining loop count

It also includes a check to test whether the final count has been reached

Dynamic Link

Location in the Stack Marker pointing to the Stack Frame of the calling procedure

This caller is temporarily dormant; i.e. it is the callee’s stack frame that is active

Since the caller also has a Dynamic Link object, all current, not yet completed Stack Frames are linked together via this data structure

For Loop

High-level construct implementing a countable loop

The x86 loop instruction is a key component for writing countable loops

Frame Pointer

Synonym for Base Pointer

Hand-Manufactured Loop

Most general type of loop: the number of iterations cannot be computed before, nor even during, the execution of the loop

Generally, the number of iterations depends on data that are input via read operations

Also, the number of steps may depend on the precision of a computed (floating-point) result and thus is not known until the end

Immediate Operand

Operand encoded as part of the instruction

No load is needed to get the immediate value; instead, it is immediately available in the instruction proper

Since instructions have a limited number of bits, the size of immediate operands usually is limited to a fraction of a natural machine instruction, or word

Load

Operation to move (read) data from memory to the processor

Usually the destination is a register

The source address is communicated in an immediate operand, or in another register, or indirectly through a register

Loop Body

Program portion executed repeatedly

This is the actual work to be accomplished. The rest is loop overhead. The goal is to minimize that overhead

Offset

A distance in memory units away from the base address

On a byte-addressable microprocessor an offset is a distance in units of bytes

Offset is frequently defined as a distance from a base register, on x86 from a segment register

Pop

Stack operation that frees data from the stack

Often, the data popped off are assigned to some other object

Other times, data are just popped because they are no longer needed, in which case only the stack space is freed

This can also be accomplished by changing the value of the stack pointer

Often the memory location is not overwritten by a pop, i.e. the data just stay. But the memory area is not considered to be part of the active stack anymore

Push

Stack operation that reserves temporary space on the stack

Generally, the space reserved on the stack through a push is initialized with the argument of the push operation

Other times, a push just reserves space on the stack for data to be initialized at a later time

Note that on the x86 architecture a push decreases the top of stack pointer (sp value)
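The behavior of push and pop described above, including the downward growth and the fact that popped data stay in memory, can be modeled with a small C sketch. The type WordStack and the functions stack_init, stack_push and stack_pop are hypothetical names; the sketch omits overflow and underflow checks for brevity.

```c
#include <stdint.h>

#define STACK_WORDS 8

typedef struct {
    uint16_t mem[STACK_WORDS]; /* stack memory, one slot per 16-bit word */
    int sp;                    /* index of the last pushed word; STACK_WORDS when empty */
} WordStack;

void stack_init(WordStack *s)
{
    s->sp = STACK_WORDS;       /* empty stack: sp sits just past the highest slot */
}

void stack_push(WordStack *s, uint16_t v)
{
    s->sp--;                   /* the stack grows toward lower addresses */
    s->mem[s->sp] = v;         /* the reserved slot is initialized with the argument */
}

uint16_t stack_pop(WordStack *s)
{
    uint16_t v = s->mem[s->sp]; /* the value stays in memory after the pop */
    s->sp++;                    /* but the slot is no longer part of the active stack */
    return v;
}
```

Pushing 1 and then 2 lowers sp by two slots; the first pop returns 2, the second returns 1, and both values remain readable in mem even though the stack is empty again.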

Repeat Loop

Loop in which the body is entered unconditionally, and thus executed at least once

The number of iterations is generally not known until the loop terminates

The termination condition is computed logically at the physical end of the loop

Return

Transfer of control after completion of a call

Usually, this is accomplished through a return instruction

The return instruction assumes the return address to be saved in a fixed part of the stack frame, called the Stack Marker

Return Value

The value returned by a function call

If the return value is a composite data structure, then the location set aside for the function return value is generally a pointer to the actual data

When no value is returned, we refer to the callee as a procedure

Stack

AKA runtime stack

Run time data structure that grows and shrinks during program execution

It generally holds data (parameters, locals, temps) and control information (return addresses, links)

Operations that change the stack include push, pop, call, return, and the like

Stack Frame

Run time data structure associated with an active procedure or function

A Stack Frame is composed of the procedure parameters, the Stack Marker, local data, and space for temporary data, including saved registers

Stack Marker

Run time data structure on the stack associated with a Stack Frame

The Stack Marker holds fixed information, whose structure is known a priori

This includes the return address, the static link, and the dynamic link

In some implementations, the Stack Marker also holds an entry for the function return value and the saved registers

Stack Pointer

AKA top of stack pointer

A pointer (typically implemented via a register) that addresses the last element allocated (pushed) on top of the stack

On the x86 architecture this is implemented via the sp register

It is also possible to have the Stack Pointer refer to the next free location (if any) on the stack in case another push operation needs stack space

Static Link

An address in the Stack Marker that points to the Frame Pointer of the last invocation of the procedure which lexically surrounds the one executing currently

This is necessary only for high level languages that allow statically nested scopes, such as Ada, Algol, and Pascal

This is not needed in more restricted languages such as C or C++

Store

Operation to move data to memory

Such moves are named writes or stores

Usually the source is a register, holding the data to be stored

The target is a memory location, whose address is held in some register

Some architectures allow the target address to be an immediate operand; not so on RISC architectures

Stride

Distance in number of bytes from one element to the next of the same type

For example, the stride of an integer array on the x86 architecture is 2 for signed and unsigned words; note that x86 calls a unit of 2 bytes a word, while most architectures have 4-byte words

It is 4 for double words on x86
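The stride values above can be observed directly in C by measuring the byte distance between two neighboring array elements. The function names word_stride and dword_stride are hypothetical; the sketch assumes the usual 8-bit byte, so that uint16_t and uint32_t correspond to the x86 word and double word.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte distance between neighboring elements of an array of 16-bit words. */
size_t word_stride(void)
{
    uint16_t a[2];
    return (size_t)((char *)&a[1] - (char *)&a[0]);
}

/* Byte distance between neighboring elements of an array of 32-bit double words. */
size_t dword_stride(void)
{
    uint32_t a[2];
    return (size_t)((char *)&a[1] - (char *)&a[0]);
}
```

Under that assumption word_stride() yields 2 and dword_stride() yields 4, matching the figures given in the definition.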

Top of Stack

Stack location of the last allocated (pushed) object

While Loop

Loop in which the body is entered after first checking whether the condition for execution is true

If false, the body is not executed. This is also used as the termination criterion

The number of iterations is generally not known until the loop terminates
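The difference between a While Loop and the Repeat Loop defined earlier shows up when the condition is false from the start. The following C sketch counts how often each body runs; the function names while_iterations and repeat_iterations are hypothetical, and the simple countdown is used only to keep the example small.

```c
/* While loop: the condition is tested first, so the body may run zero times. */
int while_iterations(int n)
{
    int count = 0;
    while (n > 0) {
        count++;
        n--;
    }
    return count;
}

/* Repeat loop: the body is entered unconditionally, so it runs at least once;
   the condition is tested at the physical end of the loop. */
int repeat_iterations(int n)
{
    int count = 0;
    do {
        count++;
        n--;
    } while (n > 0);
    return count;
}
```

For n = 0 the while form executes its body zero times and the repeat form once; for any positive n both execute the body n times.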

Bibliography

1. Jan’s Linux and Assembler: http://www.janw.easynet.be/eng.html

2. Webster Assembly Language: http://webster.cs.ucr.edu/

3. Nasm assembler under Unix: http://www.int80h.org/bsdasm/
