1
ECE 371 Microprocessors
Chapter 5: Microprocessor Assembly Language 2
Herbert G. Mayer, PSU
Status 10/2/2015
For use at CCUT Fall 2015
2
Syllabus
Motivation
Integer Multiply
Integer Divide
Conditional Branch
Loop Constructs
Memory Access
Call and Return
Procedure PutDec
Summary
Appendix
Bibliography
3
Motivation
In another handout about x86 assembly language we cover modules, character and string output, and writing assembler procedures
Here we cover integer arithmetic and loops, and develop a more complex program to output signed integer numbers
Since integer multiplication can generate results that are twice as long in bits as either source operand, the machine instructions for integer multiply (and, conversely, for integer divide) must make special provisions for the length of operands
4
X86 Integer Multiply and Divide
5
Integer Multiply
Our first project is 16-bit signed integer multiplication
To track all minute detail of the result, including overflow, sign of the result, etc., we use the small x86 machine model, which uses 16-bit operands
In that model, the smallest negative integer is -32768, the largest is 32767
The same principles apply to the newer model with 64-bit precision
6
Integer Multiply
Under Microsoft's assembler the opcodes are mul for unsigned, and imul for signed integer multiplication
One operand is the ax register; it is always implied
The other operand may be a memory location, or another register
A literal operand is not permitted in the small mode, i.e. on the 16-bit architecture version; it is o.k. on 32-bit
The result/product is in the register pair ax and dx: the low-order 16 bits in ax, the high-order 16 bits in dx
There also exists a byte version of the multiply, in which case the implied operand is in al, the other operand is a byte memory location or a byte register, and the result/product is in ax
In the code sample below we multiply literal 10, moved into register bx, with the contents of the second, implied operand: register ax
7
Integer Multiply
; integer multiplication on x86, small mode:
; multiply literal 10 with contents of ax
; ax holds a copy of memory location MAX

          mov  bx, 10       ; a literal is in bx
          mov  ax, MAX      ; signed word at location MAX
          imul bx           ; product is in ax + dx
                            ; hi order 16 bits in dx
. . .
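The effect of the 16-bit imul above can be modeled in C. This is only a sketch under stated assumptions: the names RegPair and imul16 are illustrative and do not appear in the slides, and the struct simply stands in for the ax and dx registers.

```c
#include <stdint.h>

/* Hypothetical model of the 16-bit "imul r/m16" instruction.
   The struct stands in for the register pair: ax holds the low-order
   word of the product, dx the high-order word. */
typedef struct {
    uint16_t ax;
    uint16_t dx;
} RegPair;

RegPair imul16(int16_t ax, int16_t operand)
{
    /* A 16 x 16 bit signed product always fits into 32 bits. */
    int32_t product = (int32_t)ax * (int32_t)operand;
    uint32_t bits = (uint32_t)product;   /* two's complement bit pattern */
    RegPair result;

    result.ax = (uint16_t)(bits & 0xFFFFu);   /* low-order 16 bits  */
    result.dx = (uint16_t)(bits >> 16);       /* high-order 16 bits */
    return result;
}
```

For example, 32767 * 10 = 327670 = 4FFF6 hex does not fit into 16 bits, so the model leaves FFF6 hex in ax and 0004 hex in dx.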
8
Integer Divide: cwd
Just as the integer multiply creates a signed integer double-word result in register pair ax and dx, the integer divide instruction assumes the numerator to be in the register pair ax and dx
But if the numerator happens to be a single-precision operand, it has to be explicitly extended
The denominator may be in a register or memory
To create a sign-extended double-register operand in the ax-dx pair from the single-precision operand in ax, the x86 architecture provides the convert-word-to-double-word instruction cwd
The cwd has no explicit operand
Assumed operand is the value in ax; ax is unchanged
The sign of ax is extended into dx
9
Integer Divide: cwd
; memory location B_word holds operand
; that operand is copied i.e. moved into register ax
; to be used as numerator in divide
; but first convert single- to double-precision

          mov  ax, B_word   ; signed word at B_word in ax
          cwd               ; convert word to double-word
                            ; sign of ax extended in dx

; ditto with byte-sized operands
          mov  al, a_byte   ; signed byte a_byte into al
          cbw               ; convert byte to word
                            ; sign of al extended in ah

; now the numerator can be used as operand in divide
. . .
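The two sign extensions can be written out in C as a rough model. The function names cwd_dx and cbw_ax are made up for this sketch; each returns the register value that the corresponding instruction would produce.

```c
#include <stdint.h>

/* Hypothetical model of cwd: returns the value placed into dx.
   ax itself is unchanged; dx becomes 16 copies of the sign bit of ax. */
uint16_t cwd_dx(uint16_t ax)
{
    return (ax & 0x8000u) ? 0xFFFFu : 0x0000u;
}

/* Hypothetical model of cbw: returns the new value of ax.
   al is kept; ah becomes 8 copies of the sign bit of al. */
uint16_t cbw_ax(uint8_t al)
{
    return (al & 0x80u) ? (uint16_t)(0xFF00u | al) : (uint16_t)al;
}
```

With ax = FFFB hex (that is -5), cwd_dx yields FFFF hex, so the pair dx:ax still represents -5 as a double word.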
10
Integer Divide
Integer divide needs 2 operands
Numerator is in ax, extended to dx as a double word
Other operand may be memory location or register
Opcode div is for unsigned and idiv for signed integer division
In the example assume the numerator to be in memory location A_wd
Denominator is at memory location B_wd
Quotient ends up in ax, and remainder in dx
11
Integer Divide
; signed integer divide on x86:
; assume operands to be in locations A_wd and B_wd
;
          mov  ax, A_wd     ; signed word at A_wd in ax
          cwd               ; sign of A_wd 16 times in dx
          idiv B_wd         ; quotient A_wd/B_wd in ax
                            ; remainder A_wd/B_wd in dx
                            ; flags are undefined after idiv; to see
                            ; whether the quotient is negative, test ax
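The cwd plus idiv sequence can be sketched in C as follows. The names DivResult and idiv16 are illustrative only. The sketch relies on the fact that C (since C99) truncates the quotient toward zero and gives the remainder the sign of the numerator, which matches idiv; the fault flag models the divide error the processor raises when the quotient does not fit into ax.

```c
#include <stdint.h>

/* Hypothetical result record: quotient as left in ax, remainder as
   left in dx, fault set when the CPU would raise a divide error. */
typedef struct {
    int16_t quotient;
    int16_t remainder;
    int     fault;
} DivResult;

DivResult idiv16(int16_t numerator, int16_t denominator)
{
    DivResult r = {0, 0, 0};

    /* Division by zero, and -32768 / -1 (quotient 32768 does not fit
       into 16 bits), both end in a divide error on x86. */
    if (denominator == 0 || (numerator == INT16_MIN && denominator == -1)) {
        r.fault = 1;
        return r;
    }
    r.quotient  = (int16_t)(numerator / denominator);  /* truncates toward 0 */
    r.remainder = (int16_t)(numerator % denominator);  /* sign of numerator  */
    return r;
}
```

For instance, -7 divided by 2 gives quotient -3 and remainder -1 in this model.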
12
Memory Access on the x86 Microprocessor
13
Memory Access
Key components of any computer architecture are the processor and memory
Memory is referenced implicitly and explicitly by instructions that read and write data to and from memory
Explicit accesses are called loads (for reading) and stores (for writing)
Assemblers provide explicit instructions for these operations
Implicit memory accesses occur in machine instructions whose operands may be memory cells
On RISC systems these implicit references generally do not exist; instead all memory traffic is funneled exclusively through loads and stores
14
Memory Access
In an assembler program, memory locations (both for data and code) are generally referred to symbolically
This improves readability and allows for relocation; i.e. the linker and loader have a certain degree of freedom of placement in physical memory
However, explicit memory addressing via hard-coded numbers is also possible; for example, on a hypothetical machine ld r1, 1000 could mean: load the word in memory location 1000 into register r1
Some assembly languages provide syntax to render the indirection explicit, for example the load operation:
ld r1, (1000)
uses parentheses to allude to this indirection
15
Memory Access
A common paradigm of referencing symbolic memory names (labels) is what is called indirection. This means the label value (memory address) is not what is wanted, but the contents of memory at that label location
For example, if the offset of the data name foo is 10000, then the operation ld r2, foo generally does not mean to load the value 10000 into a register
Instead, foo is used indirectly: the word at that address is referenced and loaded into register r2. When the address is really wanted, the IBM 370 architecture for example uses a special type of load, called load address, while the masm assembler for the x86 architecture uses the seg or offset operator to allude to the fact that indirection is not wanted
Instead, the segment register portion of the address, or the offset portion of the address, is wanted
16
Memory Access
During indirect memory references it is sometimes desirable to index
Indexing means that one wishes to modify an otherwise fixed memory address. Typically, such a modifier resides in a register
And if the value in that register is modified from iteration to iteration, the indexing operation can access memory in some sequential order, say in increasing (or decreasing) fashion
The size of the equal steps between such sequential memory addresses is known as the stride. For example, if r2 is a register loaded with a value (say, 2) then the instruction ld r1, foo[r2] means: fetch the word which is located 2 bytes further in memory than the offset expressed by foo
Load that word into register r1
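The indexed load ld r1, foo[r2] and a stride of 2 can be illustrated with a small C sketch. Everything here is an assumption made for illustration: memory is a plain byte array, words are 16 bits and stored low byte first (as on x86; the hypothetical machine in the text does not specify this), and load_word and sum_words are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of "ld r1, foo[r2]": fetch the 16-bit word that
   lies index bytes past the fixed offset base. Little-endian byte
   order is assumed. */
uint16_t load_word(const uint8_t *mem, size_t base, size_t index)
{
    size_t addr = base + index;
    return (uint16_t)(mem[addr] | (mem[addr + 1] << 8));
}

/* Walks count consecutive words starting at base. The index register
   r2 advances by the stride of 2 bytes per iteration. */
uint32_t sum_words(const uint8_t *mem, size_t base, size_t count)
{
    uint32_t sum = 0;
    for (size_t r2 = 0; r2 < 2 * count; r2 += 2) {
        sum += load_word(mem, base, r2);
    }
    return sum;
}
```

With r2 = 2 the load skips the first word and returns the second one, which is the situation described above.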
17
Memory Access
In addition to indexing through a register, many architectures (and thus their assemblers) allow the offset to be modified by an additional literal index
The literal value is encoded into the instruction, referred to as an immediate operand
Immediate values are usually small, since the architectures often provide just a few bits to hold it
On some architectures this immediate operand may be signed, on others only unsigned literal modifiers are possible
18
Memory Access
Memory holds the data being manipulated
Also intermediate results must be stored somewhere
Registers usually are in short supply, contrasted with the size of memory
Before completing a computation, data must be brought from peripherals to memory
After computation, data must be sent from memory to peripherals, e.g. printers
Often a cache helps overcome the speed bottleneck of memory accesses
19
Memory Access
Indexing on x86
Indirect memory references are the default semantics on assemblers for the x86 architecture
On nasm and masm this can also be expressed explicitly via the [ ] operator pair
For example the move instruction (this mov is really a load):
data_seg1 segment
foo       dw   -1999, 0, . . .
data_seg1 ends
. . .
          mov  ax, foo      ; indirection implied in masm
          mov  ax, [foo]    ; explicit indirection in masm
20
Memory Access
Indexing on x86
The above mov code loads the word at data segment location foo into register ax, regardless of whether the [] operator is used
In the nasm assembler the instruction
          mov  ax, foo      ; load offset of address in nasm
loads the address of the memory location into register ax, while the nasm mnemonic:
          mov  ax, [foo]    ; loads contents at address in nasm
loads the contents of memory location foo into ax; assembler differences can be very subtle!
21
Memory Access
Indexing on x86
A handy programming tool that makes indexing so convenient is the ability to modify address labels by registers, literals, or a combination of both
Clearly, the underlying computer architecture must support this, i.e. there must be instructions in place that allow indexed or multiply indexed load and store operations
Some architectures (including IBM 360 and x86) allow multiple registers to be used to modify (to index) the address label
These registers are referred to as base registers and index registers
Note that the term base register often means that the base address sits in that register
22
Memory Access
Indexing on x86
However, in the x86 architecture, as long as an address expression includes a data memory label, that label is the base address
With the following provisos: If l1, l2 are address labels, and c1, c2 are numeric literal constants, then:
l1 + c1      ; is address of location l1 plus c1
l1 - c2      ; is address of location l1 minus c2
l1 - l2      ; is a pure numeric value: l1 - l2
[l2 + c1]    ; is the memory content at l2 + c1
[l1 - c2]    ; is the memory content at l1 - c2
l1 + l2      ; is illegal on x86
23
Memory Access
Indexing on x86
On orthogonally designed architectures, any user-visible register is usable as an index (or base) register
Practical limitations often forced compromises. For example, on the x86 architecture, only certain registers can be used for indexing, listed below:
address expression + one of ( bx, bp, si, di ) on x86
An address expression, indexed by one (or even 2) of these index registers, may also contain a literal modifier, making the indexing operation practical and easy to use for array indexing. Note that it is possible to use up to 2 index and base registers in a single address expression, but only with the following restriction:
address expression + two of ( ( bx or bp ) and ( si or di ) ) on x86
24
Memory Access
Indexing on x86
An address expression such as [min_data+bx+si+2] is allowed, while the expression [min_data+bx+bx] is not permitted due to multiple uses of the bx register
These samples assume that min_data is a legal label in the data segment
A complete expression with all typical arithmetic operators is allowed on x86 assemblers, as long as the resulting value is computable (and reducible to a single numeric value) at the time of assembly
Thus, an expression like [chars+bx+si+2*3+4] is legal, provided that chars is a legal data label
25
Memory Access
Implicit Segment Register
Data declared in the data segment below are the hex digits '0' .. 'f'
The user-designed Put_Char macro uses system service call 02h for single-character output
Using bx as index register
Note that only base and index registers can be used for this purpose, e.g. not cx
Memory operands (data labels) are used indirectly
Indirection is explicitly expressed via the [ and ] operators
But these are not necessarily needed for memory operands in Microsoft SW, as indirection is the most common case
Since it is needed in nasm and Unix systems, we recommend use of the [ ] operator
26
Memory Access
Implicit Segment Register
Benefit is also improved readability: use explicit brackets to allude to indirection, such as [chars]
Note that indirect offset and index register are both allowed
Either or both or none may be modified by an immediate operand
Immediate operands are limited to 16 bits in size
Order of offset and index arbitrary
The output of program below is:
hm02012452267
27
Memory Access
; Purpose: memory references, indexing
; HM for use at CCUT
start     macro                 ; no parameters
          mov  ax, @data        ; @data predefined macro
          mov  ds, ax           ; now data segment reg set
          endm                  ; end macro: start

termin    macro ret_code        ; 1 parameter: return code
          mov  ah, Term_Code    ; we wanna terminate, ah + al
          mov  al, ret_code     ; any errors? If /= 0
          int  21h              ; call DOS for help
          endm                  ; end macro: termin

Put_Char  macro char            ; output passed character
          mov  ah, Cout_Code    ; tell DOS: Char out
          mov  dl, char         ; char into required byte reg
          int  21h              ; and call DOS
          endm                  ; end macro Put_Char

Cout_Code = 2h
Term_Code = 4ch

          .model small

          .data
chars     db   "0123456789abcdef"
28
Memory Access
          .code
main:     start
          mov  bx, 2                ; index char '2' in chars
          mov  cl, 'h'
          Put_Char cl               ; o.k. since cl holds char
          Put_Char 'm'
          Put_Char chars            ; not good programming
          Put_Char chars[bx]        ; shows partial indirection
          Put_Char [chars]          ; explicit
          Put_Char [chars+1]        ; explicit
          Put_Char [chars+bx]
          Put_Char chars[bx+2]
          Put_Char [chars+bx+3]
          Put_Char [bx][chars]
          Put_Char [chars]+[bx]
          Put_Char [bx+4][chars]
          Put_Char [bx+3][chars+2]
done:     termin 0                  ; no errors if we reach
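To see why the program prints hm02012452267, each address expression can be reduced to one array index, since label, register and literal parts are simply added. The C sketch below does this by hand; the function name mem1_output is made up, and the mapping in the comments reflects how the expressions above are understood to combine, not output captured from an assembler run.

```c
#include <stddef.h>
#include <string.h>   /* for strcmp in the usage note below */

static const char chars[] = "0123456789abcdef";

/* Hypothetical reconstruction of the characters the program prints.
   out must provide room for at least 14 bytes. */
void mem1_output(char *out)
{
    size_t bx = 2;   /* mov bx, 2 */
    size_t n = 0;

    out[n++] = 'h';                 /* Put_Char cl               */
    out[n++] = 'm';                 /* Put_Char 'm'              */
    out[n++] = chars[0];            /* Put_Char chars            */
    out[n++] = chars[bx];           /* Put_Char chars[bx]        */
    out[n++] = chars[0];            /* Put_Char [chars]          */
    out[n++] = chars[1];            /* Put_Char [chars+1]        */
    out[n++] = chars[bx];           /* Put_Char [chars+bx]       */
    out[n++] = chars[bx + 2];       /* Put_Char chars[bx+2]      */
    out[n++] = chars[bx + 3];       /* Put_Char [chars+bx+3]     */
    out[n++] = chars[bx];           /* Put_Char [bx][chars]      */
    out[n++] = chars[bx];           /* Put_Char [chars]+[bx]     */
    out[n++] = chars[bx + 4];       /* Put_Char [bx+4][chars]    */
    out[n++] = chars[bx + 3 + 2];   /* Put_Char [bx+3][chars+2]  */
    out[n] = '\0';
}
```

Calling mem1_output on a 14-byte buffer and comparing the result against "hm02012452267" with strcmp is a quick way to check the index arithmetic.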
29
Memory Access
Explicit Segment Register
Again the data in the data segment is the character string "0123456789abcdef"
Macros as in the example earlier
Use bx as index register again
Note: no implicit segment register used
Instead, ds used explicitly
Note syntax: seg:offset
The output of program below is:
h02012452267
30
Memory Access
; Source file: mem2.asm; use explicit segment reg
; Purpose: memory ref, indexing with explicit ds:

          .model small

          .data
chars     db   "0123456789abcdef"

          .code
main:     start
          mov  bx, 2                   ; index '2' in chars
31
Memory Access
          Put_Char 'h'
          Put_Char ds:chars            ; not good programming
          Put_Char ds:chars[bx]        ; only partial indirect
          Put_Char ds:[chars]          ; explicit
          Put_Char ds:[chars+1]        ; explicit
          Put_Char ds:[chars+bx]
          Put_Char ds:chars[bx+2]
          Put_Char ds:[chars+bx+3]
          Put_Char ds:[bx][chars]
          Put_Char ds:[chars]+[bx]
          Put_Char ds:[bx+4][chars]
          Put_Char ds:[bx+3][chars+2]
. . .
32
Word Access
Goal: to reference memory as words
Output these integers as decimal numbers
Use the yet to be designed PutDec() assembler procedure to print decimal numbers
Macros start and termin as before
Use register bx again as index register
Data segment defines some decimal and some hex literals
Data label nums defines an array of integer words
33
Word Access
Observe that modifications to the index register are done in steps of 2
Stride of a word is 2 on x86!
Note that words initialized via hex literals are still printed as signed integers
Intended output shown below:
Output: 511 512 512 513 1025 -8531 -8531 -17730 -17730
34
Word Access
; Purpose: word memory references, indexing
start     macro                 ; no parameters
          mov  ax, @data        ; @data predefined macro
          mov  ds, ax           ; now data segment reg set
          endm                  ; end macro: start

termin    macro ret_code        ; 1 parameter: return code
          mov  ah, Term_Code    ; terminate, ah + al
          mov  al, ret_code     ; any errors? If /= 0
          int  21h              ; call DOS for help
          endm                  ; end macro: termin

Term_Code = 4ch
          .model small
          .data
nums      dw   511, 512, 513, 1023, 1024, 1025
w1        dw   0deadh
w2        dw   0beefh
w3        dw   0c0edh
w4        dw   0babeh
35
Word Access, Cont'd
          .code
          extrn PutDec : near

main:     start
          mov  bx, 2                ; use bx as index register
          mov  ax, nums
          call PutDec               ; output is: 511

          mov  ax, [nums + 2]
          call PutDec               ; output is: 512

          mov  ax, [nums + bx]
          call PutDec               ; output is: 512

          mov  ax, [nums][bx + 2]
          call PutDec               ; output is: 513

          mov  ax, [nums+2][bx+6]
          call PutDec               ; output is: 1025
36
Word Access, Cont'd
          mov  nums, 0deadh
          mov  ax, nums
          call PutDec               ; output is: -8531

          mov  ax, w1
          call PutDec               ; output is: -8531

          mov  ax, [w2+bx+2]
          call PutDec               ; output is: -17730

          mov  ax, [w1+6]
          call PutDec               ; output is: -17730

done:     termin 0                  ; no errors if we reach
          end  main                 ; start here!
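The word accesses above can be replayed in C to check the intended output. This is a sketch with several assumptions: the data segment is modeled as an array of 16-bit words, the labels NUMS, W1 and W2 are byte offsets chosen to match the declaration order (nums at 0, w1 at 12, w2 at 14), and the helper names are made up. Note how byte offsets advance in steps of 2 and how the hex patterns turn negative when read as signed words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical image of the data segment: nums (6 words), then w1..w4. */
static uint16_t data_words[10] = {
    511, 512, 513, 1023, 1024, 1025, 0xdead, 0xbeef, 0xc0ed, 0xbabe
};

/* Byte offsets of the labels, derived from the declaration order. */
enum { NUMS = 0, W1 = 12, W2 = 14 };

/* Loads the word at a byte offset and interprets it as signed 16-bit. */
static int16_t load_word(size_t byte_offset)
{
    uint16_t raw = data_words[byte_offset / 2];
    return (raw >= 0x8000u) ? (int16_t)((int32_t)raw - 65536) : (int16_t)raw;
}

static void store_word(size_t byte_offset, uint16_t value)
{
    data_words[byte_offset / 2] = value;
}

/* Replays the nine loads of the program; out receives what PutDec prints. */
void word_demo(int16_t out[9])
{
    size_t bx = 2;                             /* mov bx, 2           */

    out[0] = load_word(NUMS);                  /* nums                */
    out[1] = load_word(NUMS + 2);              /* [nums + 2]          */
    out[2] = load_word(NUMS + bx);             /* [nums + bx]         */
    out[3] = load_word(NUMS + bx + 2);         /* [nums][bx + 2]      */
    out[4] = load_word(NUMS + 2 + bx + 6);     /* [nums+2][bx+6]      */
    store_word(NUMS, 0xdead);                  /* mov nums, 0deadh    */
    out[5] = load_word(NUMS);                  /* nums                */
    out[6] = load_word(W1);                    /* w1                  */
    out[7] = load_word(W2 + bx + 2);           /* [w2+bx+2], i.e. w4  */
    out[8] = load_word(W1 + 6);                /* [w1+6], i.e. w4     */
}
```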
37
Loop Constructs
38
Comparison
By default, a machine executes one instruction after another, in sequence
That sequence can be changed via branches
Branches are also known as jumps, depending on manufacturer
Unconditional branches transfer control to their defined destination
Conditional branches make this change in control flow only if their associated condition holds
39
Comparison
How does the microprocessor “know” when or whether a condition is true?
The CPU has flags that specify this condition, and instructions that test for the condition
Typical conditions are zero, negative, positive, overflow, carry, etc.
Symbolic flags are CF, ZF, OF
These can be used as operands in conditional branches, conditional calls etc.
40
Conditional Branch
-- high-level source program snippet
if a > b then
    max := a;
else
    max := b;
end if;

; corresponding x86 assembler snippet:
          mov  ax, [a]       ; memory location a
          cmp  ax, [b]       ; memory location b
          jle  b_is_max      ; jump to b_is_max if a <= b
          mov  [max], ax
          jmp  end_if        ; jump around else
b_is_max:                    ; this is else
          mov  ax, [b]
          mov  [max], ax
end_if:
. . .
41
Loops
Operations are performed repeatedly via loops
In higher level languages, loops are hand-manufactured via conditions and branches (If Statement and Gotos) or using language defined structured loop statements
The latter include Repeat, While, and For Statements
We introduce the x86 loop instruction
Generally a loop body is repeated until a particular value (sentinel) is found
A loop body entered unconditionally is akin to a Repeat Statement
42
x86 Loop
Another assembler example knows the iteration count at the time of assembly, hence the x86 provided loop instruction can be used
A sample x86 loop instruction follows:
loop next ; is executed: if --cx then goto next;
This loop body can be characterized as a For Statement
The third example does not know the number of iterations at the time of assembly. Hence, before entering the loop body the first time, a check must be made whether the loop count is 0
If so, the body is bypassed; else the body is entered and executed countably many times. Thus, the loop resembles a C-style For Statement
43
x86 Loop
We saw that loops allow the repeated execution of their bodies
Repetition is based on a condition, or on a defined number of steps, which in effect defines that condition
On the x86 architecture, the cx register functions as the counter for counted loops, with the loop opcode
On x86 the counted loop is executed by the loop instruction, assuming the loop count in cx
During each execution of the loop opcode, the value in cx is first decremented by 1
As long as cx is then not 0, execution continues at the place of the loop label
Else execution continues at the next instruction after the loop opcode
44
x86 Loop
; demonstrate the x86 "loop" instruction
; assumes count to be in cx
; when loop is executed: decrement cx
; once cx is 0, continue at instruction after loop
; else branch to label

; place 10 into cx to define loop steps
          mov  cx, 10

again:                       ; a label! Note the colon :
          mov  ax, cx        ; print value in ax
          call PutDec        ; via PutDec procedure
          loop again         ; check, if need to loop more

; prints the numbers 10 down to 1, but NOT 0
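The decrement-then-test behavior of the loop instruction can be modeled in C as a do-while. The function name loop_iterations is an assumption for this sketch; it only counts how often the body would run for a given starting value of cx. The model also shows why a starting value of 0 is dangerous: the first decrement wraps cx around to 65535.

```c
#include <stdint.h>

/* Hypothetical model of a body followed by "loop label".
   Returns how many times the body executes for the given initial cx. */
uint32_t loop_iterations(uint16_t cx)
{
    uint32_t count = 0;

    do {
        count++;                    /* the loop body runs once        */
        cx = (uint16_t)(cx - 1);    /* loop: first decrement cx ...   */
    } while (cx != 0);              /* ... then branch while cx != 0  */
    return count;
}
```

In this model a start value of 10 gives 10 iterations, matching the numbers 10 down to 1 above, while a start value of 0 gives 65536 iterations.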
45
First Loop
We define a string in data segment, all '0'..'f' digits
The data area is named 'chars' and is used as address (data offset)
The sentinel for loop termination is '#'
Register bx used as index register
Note that only bx, si, di, and bp can be used for indexing on x86
Practice the cmp instruction, which compares by subtracting, and then sets flags
Get to know conditional (jcc) and unconditional jump (jmp)
See use of labels as destinations of jumps
Output of program is:
0123456789abcdef
46
First Loop
; Source file: loop1.asm
; Purpose: use, syntax of indexing array w. sentinel
Start     macro                 ; no parameters
          mov  ax, @data        ; @data predefined macro
          mov  ds, ax           ; now data segment reg set
          endm                  ; end macro: start

Termin    macro ret_code        ; 1 parameter: return code
          mov  ah, 4ch          ; terminate: set ah + al
          mov  al, ret_code     ; any errors? If /= 0
          int  21h              ; call sys sw for help
          endm                  ; end macro: termin

Char_Out  = 2h
Sentin    = '#'

          .model small

          .data
chars     db   "0123456789abcdef", Sentin
47
First Loop
          .code
main:     start
          mov  ah, Char_Out     ; set up ah for sys
          mov  bx, 0            ; to index string, init 0

next:     mov  dl, chars[bx]    ; find next char
          inc  bx               ; increment index reg bx
          cmp  dl, Sentin       ; found sentinel?
          je   done             ; yep, so stop
          int  21h              ; nope, so print it
          jmp  next             ; try next; could be sentinel

done:     termin 0              ; no errors if we reach
          end  main             ; start here!
48
Second Loop
Again we define a character string in data segment, all '0'..'f' hex digits
This time we use no sentinel
Assume that the loop is executed exactly 16 times, and that this count is known a priori, i.e. a countable loop
Again we use register bx as index register
Learn loop instruction, which tracks loop count and conditional branch
Loop instruction on x86 subtracts 1 from cx each time it is executed
If cx = 0, fall through; else branch to target, which is part of instruction
Output of program is:
0123456789abcdef
49
Second Loop
; Source file: loop2.asm
; Purpose: use, syntax of indexing char array
; loop is "countable": we know # of elements
; before start of loop; we know at assembly time

. . . same macros start, termin

Char_Out  = 2h
Num_El    = 10h                 ; 16 elements in chars array[]

          .model small

          .data
chars     db   "0123456789abcdef"
50
Second Loop
          .code                 ; abbreviation
main:     start
          mov  ah, Char_Out     ; set up ah for system call
          mov  bx, 0            ; initial index off 'chars'
          mov  cx, Num_El       ; know # iterations a priori
next:     mov  dl, chars[bx]    ; find next char
          inc  bx               ; increment index register
          int  21h              ; print it
          loop next             ; try next one; could be 0: end
51
Third Loop
Again we define a character string in data segment, all '0'..'f' hex digits, no sentinel
Assume iteration count is not known a priori
Again use register bx as index register
Must check whether cx is less than or equal to zero
Caution: If cx were negative, this would be bad news, as looping will be excessive!
Good that x86 provides a special opcode jcxz
Loop instruction on x86 subtracts 1 from cx; should start with a positive value
New instruction jcxz: if cx is already zero at start, branch and don't enter loop body
Output of program is:
0123456789abcdef
52
Third Loop
; Source file: loop3.asm
; Purpose: use, syntax of indexing char array

          .model small

          .data
chars     db   "0123456789abcdef"

          .code
main:     start
          mov  ah, Char_Out     ; set up ah for DOS call
          mov  bx, 0            ; initial index off 'chars'

; assume that # read at run time
; fake this reading by brute-force setting
; but the point is: The # could be non-positive!
53
Third Loop
          mov  cx, 16           ; pretend we read value of cx

          cmp  cx, 0            ; then test if cx < 0
          jl   done_neg         ; if it is, jump
          jcxz done_zero        ; if it is zero, jump also

; if we reach this: cx is positive

next:     mov  dl, [chars][bx]  ; find next char
          inc  bx               ; increment index register
          int  21h              ; output next character
          loop next             ; try next one; could be end

done:     termin 0              ; no errors if we reach
done_neg: termin 1              ; another error code. Not 0
done_zero: termin 2             ; and yet another error

          end  main             ; start here!
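The guard in front of the counted loop can be sketched in C. The function name third_loop and its return codes (0, 1, 2, mirroring the three termin calls) are assumptions of this sketch. One deliberate difference from the assembly: the C version stops after 16 characters so that it never reads past the string, a bound the assembler program does not have.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>   /* for strcmp in the usage note below */

/* Hypothetical C rendering of loop3.asm. out needs room for 17 bytes.
   Returns 0 after a normal run, 1 if cx is negative, 2 if cx is zero. */
int third_loop(int16_t cx, char *out)
{
    static const char chars[] = "0123456789abcdef";
    size_t bx = 0;

    out[0] = '\0';
    if (cx < 0) {            /* cmp cx, 0 / jl done_neg */
        return 1;
    }
    if (cx == 0) {           /* jcxz done_zero */
        return 2;
    }
    do {
        if (bx >= 16) {      /* safety bound, not present in the assembly */
            break;
        }
        out[bx] = chars[bx]; /* mov dl, [chars][bx] / int 21h */
        bx++;                /* inc bx */
    } while (--cx != 0);     /* loop next */
    out[bx] = '\0';
    return 0;
}
```

Calling third_loop(16, buf) fills buf with the 16 hex digits; comparing buf with "0123456789abcdef" via strcmp confirms the expected output.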
54
X86 Call and Return
55
Call and Return
High level programming requires logical (and physical) modularization to render the overall programming task manageable
Key tool for logical modularization is the creation of procedures (in some languages called subroutines, functions, etc.) with their associated calls and returns
This section introduces calling and returning, also known as context switching
We’ll use the term procedure generically to mean procedure, function, or subroutine, unless the particular meaning is needed
56
Call and Return
It is not feasible to express a complete program as a single procedure, when the program is large
Logical modules reduce complexity of programming task
This allows re-use and reincarnation of the same procedure through parameterization
A High Level language should hide the detail of call/return mechanism; not so in assembler
For example, the manipulation of the stack through push and pop operations should be hidden
However, some aspects of context switch should be reflected in High Level language, in particular the call and return
57
Call and Return
As in High-Level language programs, procedures are a key syntax tool to modularize
Physical modules (procedures) encapsulate data and actions that belong together
Physical modules (delineated by the proc and endp keywords) are the language tool to define modules
Procedures can be called, via the call opcode, parameterized by the procedure name, e.g.:
call PutDec
Procedures return, via the ret instruction
If they return a result to the calling environment, we refer to them as functions
A return ends up at the instruction after the call
58
Call and Return: Stack Frame
Stack Pointer identifies top of current stack, and also top of current Stack Frame
Stack pointer may vary often during invocation
Stack pointer changes upon call, return, push, pop, explicit assignments
Base pointer does not vary during call
Base pointer only set up once at start of call
Base pointer changed again at return, to value of previous base pointer, the dynamic link
Parameters can be addressed relative to base pointer in one direction
59
Call and Return: Stack Frame
Locals (and temps) can be addressed relative to base pointer in the other direction
Possible to save base pointer, useful when registers are scarce, as on x86
However, this scheme is difficult, since compiler (or human programmer) must keep dynamic size of stack in mind at any moment of time of code generation; not discussed here
60
Call and Return: Stack Frame
[Figure: Stack Frame, consisting of Locals + Temps, Stack Marker, and Actual Parameters, with the sp and bp pointers]
61
Call and Return: Before Call
Push actual parameters: Changes the stack
Track size of actual parameters pushed
In most languages the actual size is fixed; not so in C
Base pointer still points to Stack Marker of caller
After last actual parameter pushed: one flexible part of Stack Frame complete
62
Call and Return: Call
Push the instruction pointer (ip)
The address of the instruction after the call must be saved as return address
This identifies the beginning of the Stack Marker
Set instruction pointer (ip, AKA pc) to the address of the destination (callee)
x86 architecture has 24 flavors of call instructions
63
Call and Return: Procedure Entry
Push Base Pointer, this is the dynamic link
Set Base Pointer to the value of the Stack Pointer
Now the new Stack Frame is being addressed
The fixed part of stack, the Stack Marker, is being built
Allocate space for local variables, if any
This establishes another area of the Stack Frame that is variable in size
64
Call and Return: Return
Pop locals and temps off stack
This frees the second variable size area from the Stack Frame
Pop registers to be restored
Pop the top of stack value into the Base Pointer (bp)
This uses the Dynamic Link to reactivate the previous Stack Frame
Pop top of stack value into instruction pointer
65
Call and Return: Return
This sets the ip register back to the instruction after the call
The return instruction does this!
Either the caller, or a suitable argument of the return instruction, frees the space allocated for actual parameters
Note that the x86 architecture allows an argument to the ret instruction, freeing that number of bytes from the stack
66
Call and Return Code
1a. Procedure Entry, No Locals, Save Regs

          push bp        ; save dyn link in Stack Marker
          mov  bp, sp    ; establish new Frame: point to S.M.
          push ax        ; save ax if needed by callee, opt.
          push bx        ; ditto for bx
67
Call and Return Code
1b. Procedure Exit, No Locals, Restore Regs

          pop  bx        ; restore bx if was used by callee
          pop  ax        ; ditto for ax
          pop  bp        ; must find back old Stack Frame
          ret  args      ; ip to instruction after call; free args
68
Call and Return Code
2a. Procedure Entry, With Locals, No Regs

          push bp        ; save dyn link in Stack Marker
          mov  bp, sp    ; establish new Frame: point to S.M.
          sub  sp, 24    ; allocate 24 bytes uninitialized
                         ; space for locals
69
Call and Return Code
2b. Procedure Exit, With Locals, No Regs

          mov  sp, bp    ; free all locals and temps
          pop  bp        ; must find old S.F., RA on top
          ret  args      ; ip to instruction after call
                         ; free args
70
Call and Return Code
3a. Procedure Entry, With Locals, Save Regs

          push bp        ; save dyn link in Stack Marker
          mov  bp, sp    ; establish new Frame: point to S.M.
          sub  sp, 24    ; allocate 24 bytes uninitialized
                         ; space for locals
          push ax        ; save ax if needed by callee, opt.
          push bx        ; ditto for bx
71
Call and Return Code
3b. Procedure Exit, With Locals, Restore Regs

          pop  bx        ; restore bx if was used by callee
          pop  ax        ; ditto for ax
          mov  sp, bp    ; free all locals and temps
          pop  bp        ; must find back old S.F., RA on top
          ret  args      ; ip to instruction after call; free args
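The entry and exit sequences can be traced with a tiny stack simulation in C. This is a sketch, not a model of the real processor: Machine, push16, pop16, peek16 and call_with_frame are made-up names, the stack is a 64-byte array that grows toward lower addresses, and one 2-byte argument plus a near (2-byte) return address are assumed. It shows that after push bp and mov bp, sp the argument sits at [bp+4], and that ret 2 leaves sp where it was before the argument was pushed.

```c
#include <stdint.h>

enum { STACK_BYTES = 64 };

/* Hypothetical machine state: a small stack plus sp and bp. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint16_t sp;
    uint16_t bp;
} Machine;

static void push16(Machine *m, uint16_t value)
{
    m->sp = (uint16_t)(m->sp - 2);               /* stack grows downward */
    m->mem[m->sp]     = (uint8_t)(value & 0xFFu);
    m->mem[m->sp + 1] = (uint8_t)(value >> 8);
}

static uint16_t peek16(const Machine *m, uint16_t addr)
{
    return (uint16_t)(m->mem[addr] | (m->mem[addr + 1] << 8));
}

static uint16_t pop16(Machine *m)
{
    uint16_t value = peek16(m, m->sp);
    m->sp = (uint16_t)(m->sp + 2);
    return value;
}

/* Simulates: push arg / call / entry with locals / exit / ret 2.
   Returns the argument as the callee sees it at [bp+4]. */
uint16_t call_with_frame(Machine *m, uint16_t arg,
                         uint16_t return_addr, uint16_t local_bytes)
{
    uint16_t seen;

    push16(m, arg);                             /* caller: push parameter */
    push16(m, return_addr);                     /* call: push ip          */
    push16(m, m->bp);                           /* push bp (dynamic link) */
    m->bp = m->sp;                              /* mov bp, sp             */
    m->sp = (uint16_t)(m->sp - local_bytes);    /* sub sp, local_bytes    */

    seen = peek16(m, (uint16_t)(m->bp + 4));    /* mov ax, [bp+4]         */

    m->sp = m->bp;                              /* mov sp, bp             */
    m->bp = pop16(m);                           /* pop bp                 */
    (void)pop16(m);                             /* ret: pop ip ...        */
    m->sp = (uint16_t)(m->sp + 2);              /* ... ret 2: free arg    */
    return seen;
}
```

Starting with sp = 64 and some caller bp, a call with 24 bytes of locals returns the pushed argument and restores both sp and bp to their starting values.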
72
Call and Return: Recursive Factorial in C

// source: fact.c
. . .
unsigned fact( unsigned arg )
{ // fact
    if ( arg <= 1 ) {
        return 1;
    }else{
        return fact( arg - 1 ) * arg;
    } //end if
} //end fact
73
Call and Return: Recursive Factorial in x86

; Source file: fact.asm
penter    macro
          push bp
          mov  bp, sp
          push bx
          push cx
          push dx
          endm

pexit     macro args
          pop  dx
          pop  cx
          pop  bx
          pop  bp
          ret  args
          endm

Errcode   = 4ch
MAX       = 9d

          .model small

          .stack 100h

          .data
arg       dw   0
74
Call and Return: Recursive Factorial in x86
          .code
          extrn uPutDec : near

; assume arg on tos; return fact( int arg ) in ax

rfact     proc
          penter
          mov  ax, [bp+4]    ; arg 4 bytes before dyn link
          cmp  ax, 1         ; argument > 1?
          jg   recurse       ; if so: recursive call

base:     mov  ax, 1         ; No: then 0! = 1! = 1
          pexit 2            ; done, free 2 bytes = arg

recurse:  mov  ax, [bp+4]    ; recurse; get next arg
          dec  ax            ; but decrement first
          push ax            ; and pass on stack
          call rfact         ; recurse!
          mov  cx, [bp+4]    ; product in ax, * arg
          mul  cx            ; product in ax
          pexit 2            ; and done
rfact     endp
75
Call and Return: Recursive Factorial in x86

drive_r   proc
          mov  arg, 0        ; initial memory
          mov  ax, 0         ; initial value again
          mov  bp, sp        ; no space for locals needed

again_r:  cmp  arg, MAX
          jge  done_r
; ax holds argument to be factorialized :-)
          push ax            ; argument on stack
          call rfact
; now ax holds factorial value
          call uPutDec       ; print next result
          inc  arg           ; compute next fact(arg)
          mov  ax, arg       ; pass in ax
          jmp  again_r

done_r:   ret
drive_r   endp
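A short C sketch suggests why the driver stops before MAX = 9: the result is taken from ax alone, so only factorials that fit into 16 bits come out right. The names fact32 and fact16 are illustrative; fact16 keeps just the low-order word of each product, which is what remains in ax after mul when dx is ignored.

```c
#include <stdint.h>

/* Reference factorial in 32-bit arithmetic (exact up to 12!). */
uint32_t fact32(unsigned n)
{
    uint32_t result = 1;
    for (unsigned i = 2; i <= n; i++) {
        result *= i;
    }
    return result;
}

/* Hypothetical 16-bit version: after each multiply only the low word
   (the ax part of dx:ax) is kept. */
uint16_t fact16(unsigned n)
{
    uint16_t result = 1;
    for (unsigned i = 2; i <= n; i++) {
        result = (uint16_t)((uint32_t)result * i);
    }
    return result;
}
```

8! = 40320 still fits into an unsigned 16-bit word, while 9! = 362880 does not; the 16-bit version would report 35200 for it. Since 40320 exceeds 32767, printing the results as unsigned values is necessary, which is presumably the reason the driver calls uPutDec.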
76
Design Asm Procedure PutDec
77
Design PutDec: Goal Definition
Design an assembly language procedure, which prints a passed integer value in decimal notation
Values are passed in a machine register
Values may be positive or negative
Use x86 small arithmetic, i.e. 16-bit integer precision, to easily track overflow, minimum and maximum integer values
We'll proceed stepwise:
1. Printing a character
2. Printing a decimal digit, given an integer value 0..9
3. Finally printing the complete integer
78
Design PutDec: Define Macro Put_Ch to print one character

; character is passed in dl; fiddle with ax, dx
; restore before finishing

Put_Ch    macro char        ; 'char' is char 2 b printed
          push ax           ; save ax
          push dx           ; ditto for dx; use only dl
          mov  dl, char     ; move into formal parameter
          mov  ah, 02h      ; tell system SW whom to call
          int  021h         ; call system SW, e.g. DOS
          pop  dx           ; restore
          pop  ax           ; ditto
          endm
79
Design PutDec
Print integer value 0..9 in dl as a character
; assume integer 0..9 to be in dl
; convert to ASCII character; simple: just add '0'
;
p_char:   add  dl, '0'      ; convert int to char
          Put_Ch dl         ; previously defined macro
80
Design PutDec: Print rightmost digit of number in ax in decimal

; ax holds non-negative integer value
; but is a binary number, i.e. binary 0..9; need ASCII
          mov  bx, 10       ; base 10 is in bx
          sub  dx, dx       ; make ax-dx a double word
          div  bx           ; unsigned divide ax by 10
; remainder is in dx; known to be < 10, so dl holds it
          add  dl, '0'      ; make int a printable char
          Put_Ch dl         ; print that char
81
Asm Source For Procedure PutDec
82
PutDec Asm Code: Macros
; Purpose: print various signed 16-bit numbers
start     macro
          mov  ax, @data    ; typical for MS system SW
          mov  ds, ax
          endm

finish    macro             ; also MS system SW
          mov  ax, 4c00h
          int  21h
          endm

Put_Ch    macro char        ; 'char' char is printed
          push ax           ; save cos ax is overwritten
          push dx           ; ditto for dx
          mov  dl, char     ; move character into parameter
          mov  ah, 02h      ; tell DOS who
          int  021h         ; call DOS
          pop  dx           ; restore
          pop  ax           ; ditto
          endm
83
PutDec Asm Code: Macros
Put_Str   macro str_addr    ; print string at 'str_addr'
          push ax           ; save
          push dx           ; save
          mov  dx, offset str_addr
          mov  ah, 09h      ; DOS proc id
          int  021h         ; call DOS
          pop  dx           ; restore
          pop  ax           ; ditto
          endm

base_10   = 10
          .model small
          .stack 500
          .data
min_num   db   '-32768$'          ; end strings with '$'
num_is    db   'the number is: $'
cr_lf     db   10, 13, '$'        ; magic numbers for lf, cr
84
PutDec Asm Code: Body
        .code
; ax value printed as a decimal integer number
PutDec  proc
        ; special case -32768 cannot be negated
        cmp  ax, -32768     ; is it special case?
        jne  do_work        ; nope, so do your real job
        Put_Str min_num     ; yep: so print it and be done
        ret                 ; done. Printed -32768
do_work:                    ; ax NOT -32768; is negative?
        push ax
        push bx
        push cx
        push dx
        cmp  ax, 0          ; negative number?
        jge  positive       ; if negative, fall through: invert sign, print -
        neg  ax             ; here the inversion
        Put_Ch '-'          ; print - sign; Put_Ch preserves ax
positive:
        sub  cx, cx         ; cx counts steps = # digits
        mov  bx, base_10    ; divisor is 10
        ; now we know number in ax is non-negative
85
PutDec Asm Code: Body
; continue with non-negative number
push_m: sub  dx, dx         ; make dx:ax a double word
        div  bx             ; unsigned divide o.k.
        add  dl, '0'        ; make number a char
        push dx             ; save; order reversed
        inc  cx             ; count steps
        cmp  ax, 0          ; finally done?
        jne  push_m         ; if not, do next step
        ; all chars are now on the stack; popping yields them left to right
pop_m:  pop  dx             ; pop to dx; dl of interest
        Put_Ch dl           ; print it as char
        loop pop_m          ; more work? If so, do again
done:   pop  dx             ; restore what you messed up
        pop  cx             ; ditto
        pop  bx
        pop  ax
        ret                 ; return to caller
PutDec  endp
86
PutDec Asm Code: Driver
; output readable string. Print #, carriage-return
next_n  proc
        put_str num_is      ; print message
        call putdec         ; print #
        put_str cr_lf       ; cr lf
        ret
next_n  endp                ; repeat label before endp
num     macro val           ; just to practice macros
        mov  ax, val        ; PutDec expects # in ax
        call next_n         ; message, print #, cr lf
        endm
87
PutDec Asm Code: Main
main    proc                ; entry point under Windows
        start               ; set up for OS
        ; exercise all kinds of cases, including corner cases
        num  -32768         ; all macro expansions
        num  -32767         ; ditto
        num  32767          ; put this # into ax
        num  100
        num  1
        num  -1
        num  0
        num  0ffh
        finish
main    endp
        end  main           ; this IDs the entry point; can be different name
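PutDec treats -32768 as a special case because that value cannot be negated in 16 bits. The following minimal sketch, a separate small program under the same small model and DOS conventions, illustrates this; the label show and the letters O and N are illustrative choices, not part of the PutDec source.

```asm
; Sketch: negating -32768 leaves ax unchanged and sets the overflow flag
        .model small
        .stack 100h
        .code
main    proc
        mov  ax, -32768     ; bit pattern 8000h
        neg  ax             ; 0 - 8000h is again 8000h in 16 bits
        mov  dl, 'N'        ; assume no overflow; mov leaves the flags alone
        jno  show           ; overflow flag clear: keep 'N'
        mov  dl, 'O'        ; overflow flag was set by neg
show:   mov  ah, 02h        ; DOS: print character in dl
        int  21h            ; prints O
        mov  ax, 4c00h      ; terminate via DOS
        int  21h
main    endp
        end  main
```

Since +32768 does not fit into a signed 16-bit word, printing the prepared string min_num is the simplest way out.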
88
Appendix: Some Definitions
89
Definitions
Activation Record
Synonym for Stack Marker
90
Definitions
Base Address
Memory address of an aggregate area
Usually a segment register (AKA base register) is used to hold a base address
Addressing can then proceed relative to such a base address
91
Definitions
Base Pointer
An address pointer (often implemented via a dedicated register) that identifies an agreed-upon area in the Stack Frame of an executing procedure
On the x86 architecture this is implemented via the dedicated bp register
92
Definitions
Binding
Procedures may have parameters
Formal parameters express attributes such as type, name, and similar attributes
At the place of call, these formal parameters receive initial, actual values through so-called actual parameters
Sometimes, an actual parameter is solely the address of the true object referenced during the call
The association of actual to formal parameter is referred to as parameter binding
93
Definitions
Branch
Transfer of control to a destination that is generally not the instruction following the branch
Synonym: Jump. The destination is an explicit or implicit operand of the branch instruction
94
Definitions
Call
Transfer of control (a.k.a. context switch) to the operand of the call instruction
A call expects that after completion, the program resumes execution at the place after the call instruction
95
Definitions
Countable Loop
Loop in which the number of iterations can be computed (is known) before the loop body starts
Thus the loop must include code to change the remaining loop count
It also includes a check that tests whether the final count has been reached
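A minimal sketch of a countable loop, assuming the small model and DOS termination used elsewhere in this chapter. The label again and the task of summing 1..5 are illustrative. The x86 loop instruction combines both the count update and the check: it decrements cx and jumps while cx is not zero.

```asm
; Sketch: countable loop that sums 5+4+3+2+1
        .model small
        .stack 100h
        .code
main    proc
        mov  cx, 5          ; iteration count, known before the loop starts
        sub  ax, ax         ; running sum starts at 0
again:  add  ax, cx         ; loop body: add current count to sum
        loop again          ; cx = cx - 1; jump to again while cx is not 0
                            ; here ax = 15
        mov  ax, 4c00h      ; terminate via DOS
        int  21h
main    endp
        end  main
```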
96
Definitions
Dynamic Link
Location in the Stack Marker pointing to the Stack Frame of the calling procedure
This caller is temporarily dormant; i.e. it is the callee's stack frame that is active
Since the caller also has a Dynamic Link object, all currently incomplete Stack Frames are linked together via this data structure
97
Definitions
For Loop
High-level construct implementing a countable loop
The x86 loop instruction is a key component for writing countable loops
98
Definitions
Frame Pointer
Synonym for Base Pointer
99
Definitions
Hand-Manufactured Loop
Most general type of loop: the number of iterations cannot be computed before the loop starts, not even during its execution
Generally, the number of iterations depends on data that are input via read operations
Also, the number of steps may depend on the precision of a computed (floating-point) result and thus is not known until the end
100
Definitions
Immediate Operand
Operand encoded as part of the instruction
No load is needed to get the immediate value; instead, it is immediately available in the instruction proper
Since opcodes have a limited number of bits, the size of an immediate operand is usually limited to a fraction of a natural machine instruction, or word
101
Definitions
Load
Operation to move (read) data from memory to the processor
Usually the destination is a register
The source address is communicated in an immediate operand, or in another register, or indirectly through a register
102
Definitions
Loop Body
Program portion executed repeatedly
This is the actual work to be accomplished. The rest is loop overhead. The goal is to minimize that overhead
103
Definitions
Offset
A distance in memory units away from the base address
On a byte addressable microprocessor an offset is a distance in units of bytes
Offset is frequently defined as a distance from a base register, on x86 from a segment register
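A minimal sketch of offset addressing, assuming the small model and the DOS character-output call (ah = 02h) used by Put_Ch. The string msg is an illustrative name. The segment register ds supplies the base; bx holds the offset of msg, and the displacement 1 selects the second byte.

```asm
; Sketch: fetch the byte at distance 1 from the start of msg and print it
        .model small
        .stack 100h
        .data
msg     db   'HELLO$'
        .code
main    proc
        mov  ax, @data
        mov  ds, ax         ; segment register holds the base address
        mov  bx, offset msg ; offset of msg from the segment base
        mov  dl, [bx+1]     ; byte at offset of msg plus 1: 'E'
        mov  ah, 02h        ; DOS: print character in dl
        int  21h            ; prints E
        mov  ax, 4c00h      ; terminate via DOS
        int  21h
main    endp
        end  main
```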
104
Definitions
Pop
Stack operation that frees data from the stack
Often, the data popped off are assigned to some other object
Other times, data are just popped because they are no longer needed, in which case only the stack space is freed
This can also be accomplished by changing the value of the stack pointer
Often the memory location is not overwritten by a pop, i.e. the data just stay. But the memory area is no longer considered part of the active stack
105
Definitions
Push
Stack operation that reserves temporary space on the stack
Generally, the space reserved on the stack through a push is initialized with the argument of the push operation
Other times, a push just reserves space on the stack for data to be initialized at a later time
Note that on the x86 architecture a push decreases the top of stack pointer (sp value)
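A minimal sketch of the push and pop variants described in these definitions, assuming 16-bit operands and the small model. The value 1234h is arbitrary. Space can be reserved or freed either with push and pop or by adjusting sp directly.

```asm
; Sketch: push and pop, with and without data transfer
        .model small
        .stack 100h
        .code
main    proc
        mov  ax, 1234h
        push ax             ; sp = sp - 2; the new top word holds 1234h
        push ax             ; second copy; sp decreases by 2 again
        pop  bx             ; bx = 1234h; sp = sp + 2
        add  sp, 2          ; free the other word without copying it
        sub  sp, 2          ; reserve one word, left uninitialized
        add  sp, 2          ; and free it again
        mov  ax, 4c00h      ; terminate via DOS
        int  21h
main    endp
        end  main
```

After add sp, 2 the old word is still in memory, but it is no longer part of the active stack.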
106
Definitions
Repeat Loop
Loop in which the body is entered unconditionally, and thus executed at least once
The number of iterations is generally not known until the loop terminates
The termination condition is computed logically at the physical end of the loop
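A minimal sketch of a repeat loop, assuming the small model. The label halve and the start value 100 are illustrative. The push_m loop in PutDec has the same shape: the body runs first, and the test at the end decides whether to go around again.

```asm
; Sketch: count how many right shifts reduce ax to 0
        .model small
        .stack 100h
        .code
main    proc
        mov  ax, 100        ; value to examine
        sub  cx, cx         ; cx counts iterations
halve:  shr  ax, 1          ; body: executed at least once
        inc  cx
        cmp  ax, 0          ; termination test at the physical end
        jne  halve          ; repeat until ax = 0
                            ; here cx = 7
        mov  ax, 4c00h      ; terminate via DOS
        int  21h
main    endp
        end  main
```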
107
Definitions
Return
Transfer of control after completion of a call
Usually, this is accomplished through a return instruction
The return instruction assumes the return address to be saved in a fixed part of the stack frame, called the Stack Marker
108
Definitions
Return Value
The value returned by a function call
If the return value is a composite data structure, then the location set aside for the function return value is generally a pointer to the actual data
When no value is returned, we refer to the callee as a procedure
109
Definitions
Stack
AKA runtime stack
Run time data structure that grows and shrinks during program execution
It generally holds data (parameters, locals, temps) and control information (return addresses, links)
Operations that change the stack include push, pop, call, return, and the like
110
Definitions
Stack Frame
Run time data structure associated with an active procedure or function
A Stack Frame is composed of the procedure parameters, the Stack Marker, local data, and space for temporary data, including saved registers
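A minimal sketch of a stack frame on the 16-bit x86, assuming the small model (near calls, hence a 2-byte return address). The procedure name double_it and the choice of passing one word on the stack are illustrative, not taken from the PutDec program. The frame here holds the parameter, the return address, and the saved bp; no static link is needed in this flat example.

```asm
; Sketch: a procedure that reads one word parameter through bp
        .model small
        .stack 100h
        .code
double_it proc              ; hypothetical name; result returned in ax
        push bp             ; save caller's bp: the dynamic link
        mov  bp, sp         ; bp now identifies this procedure's frame
        mov  ax, [bp+4]     ; parameter lies above saved bp and return address
        add  ax, ax         ; work on the parameter
        pop  bp             ; restore caller's frame pointer
        ret  2              ; return and free the 2-byte parameter
double_it endp
main    proc
        mov  ax, 21
        push ax             ; actual parameter goes on the stack
        call double_it      ; near call pushes the return address
                            ; here ax = 42
        mov  ax, 4c00h      ; terminate via DOS
        int  21h
main    endp
        end  main
```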
111
Definitions
Stack Marker
Run time data structure on the stack associated with a Stack Frame
The Stack Marker holds fixed information, whose structure is known a priori
This includes the return address, the static link, and the dynamic link
In some implementations, the Stack Marker also holds an entry for the function return value and the saved registers
112
Definitions
Stack Pointer
AKA top of stack pointer
A pointer (typically implemented via a register) that addresses the last element allocated (pushed) on top of the stack
On the x86 architecture this is implemented via the sp register
It is also possible to have the Stack Pointer refer to the next free location (if any) on the stack in case another push operation needs stack space
113
Definitions
Static Link
An address in the Stack Marker that points to the Frame Pointer of the last invocation of the procedure which lexically surrounds the one executing currently
This is necessary only for high level languages that allow statically nested scopes, such as Ada, Algol, and Pascal
This is not needed in more restricted languages such as C or C++
114
Definitions
Store
Operation to move data to memory
Such moves are named writes or stores
Usually the source is a register, holding the value to be stored
The target is a memory location, whose address is held in some register
Some architectures allow the target address to be an immediate operand; not so on RISC architectures
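A minimal sketch contrasting a load with the two store forms named here, assuming the small model. The variables total and saved are illustrative names. On x86 the target address of a store may be encoded in the instruction or held in a register.

```asm
; Sketch: one load and two kinds of store
        .model small
        .stack 100h
        .data
total   dw   7
saved   dw   ?
        .code
main    proc
        mov  ax, @data
        mov  ds, ax         ; make ds address the data segment
        mov  ax, total      ; load: source address encoded in the instruction
        mov  bx, offset saved ; immediate operand: the offset of saved
        mov  [bx], ax       ; store: target address held in register bx
        mov  total, 0       ; store: target address encoded in the instruction
        mov  ax, 4c00h      ; terminate via DOS
        int  21h
main    endp
        end  main
```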
115
Definitions
Stride
Distance in number of bytes from one element to the next of the same type
For example, the stride of an integer array on the x86 architecture is 2 for signed and unsigned words (note that x86 calls a unit of 2 bytes a word; most architectures have 4-byte words)
It is 4 for double words on x86
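A minimal sketch of stepping through a word array with stride 2, assuming the small model. The array vals and the label sum_up are illustrative names.

```asm
; Sketch: sum four words; si advances by the stride of a word
        .model small
        .stack 100h
        .data
vals    dw   10, 20, 30, 40 ; four 2-byte words
        .code
main    proc
        mov  ax, @data
        mov  ds, ax         ; make ds address the data segment
        mov  si, offset vals ; si = offset of the first element
        mov  cx, 4          ; number of elements
        sub  ax, ax         ; sum = 0
sum_up: add  ax, [si]       ; add current word to the sum
        add  si, 2          ; advance by the stride: 2 bytes per word
        loop sum_up         ; repeat for all 4 elements
                            ; here ax = 100
        mov  ax, 4c00h      ; terminate via DOS
        int  21h
main    endp
        end  main
```

For an array of double words the same loop would advance si by 4.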
116
Definitions
Top of Stack
Stack location of the last allocated (pushed) object
117
Definitions
While Loop
Loop in which the body is entered after first checking whether the condition for execution is true
If false, the body is not executed. The same condition also serves as the termination criterion
The number of iterations is generally not known until the loop terminates
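A minimal sketch of a while loop, assuming the small model. The labels test_it and finished are illustrative. The condition is tested before the body, so with ax = 0 on entry the body is skipped entirely.

```asm
; Sketch: count right shifts, but test before entering the body
        .model small
        .stack 100h
        .code
main    proc
        mov  ax, 0          ; value to examine; try other values
        sub  cx, cx         ; cx counts iterations
test_it:
        cmp  ax, 0          ; condition checked before the body
        je   finished       ; condition false: leave the loop
        shr  ax, 1          ; loop body
        inc  cx
        jmp  test_it        ; back to the test
finished:                   ; with ax = 0 on entry, cx is still 0
        mov  ax, 4c00h      ; terminate via DOS
        int  21h
main    endp
        end  main
```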
118
Bibliography
1. Jan’s Linux and Assembler: http://www.janw.easynet.be/eng.html
2. Webster Assembly Language: http://webster.cs.ucr.edu/
3. Nasm assembler under Unix: http://www.int80h.org/bsdasm/