Manipulating Information (2) Arithmetic Operations


Transcript of Manipulating Information (2) Arithmetic Operations

Page 1: Manipulating Information (2) Arithmetic Operations

1

Manipulating Information (2)Arithmetic Operations

Page 2: Manipulating Information (2) Arithmetic Operations

2

Outline

• Arithmetic Operations
– Overflow
– Unsigned addition, multiplication
– Signed addition, negation, multiplication
– Using Shift to perform power-of-2 multiply

• Suggested reading

– Chap 2.3

Page 3: Manipulating Information (2) Arithmetic Operations

3

Unsigned Addition

[Figure: adding two w-bit operands u and v]

Operands: w bits (u, v)

True Sum: w+1 bits (u + v)

Discard Carry: w bits, UAddw(u, v)

Page 4: Manipulating Information (2) Arithmetic Operations

4

Unsigned Addition

• Standard Addition Function

– Ignores carry output

• Implements Modular Arithmetic

– s = UAddw(u, v) = (u + v) mod 2^w

Page 5: Manipulating Information (2) Arithmetic Operations

5

Unsigned Addition

Practice Problem 2.27: Write a function with the following prototype:

/* Determine whether arguments can be added without overflow */

int uadd_ok(unsigned x, unsigned y);

This function should return 1 if arguments x and y can be added without causing overflow.

Overflow iff (x + y) < x, where x + y is the truncated w-bit sum
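One possible solution, sketched directly from the overflow condition stated above. It relies only on the fact that unsigned addition in C wraps modulo 2^w; it is not necessarily the textbook's reference answer.

```c
/* Return 1 if x and y can be added without unsigned overflow, else 0. */
int uadd_ok(unsigned x, unsigned y) {
    unsigned sum = x + y;   /* wraps modulo 2^w, which is well defined for unsigned */
    return sum >= x;        /* overflow happened iff the truncated sum fell below x */
}
```

For example, uadd_ok(1, 2) returns 1, while adding 1 to the largest unsigned value wraps to 0 and the function returns 0.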

Page 6: Manipulating Information (2) Arithmetic Operations

6

Unsigned Addition

Page 7: Manipulating Information (2) Arithmetic Operations

7

Unsigned Addition Forms an Abelian Group

• Closed under addition

– 0 ≤ UAddw(u, v) ≤ 2^w – 1

• Commutative

– UAddw(u , v) = UAddw(v , u)

• Associative

– UAddw(t, UAddw(u, v)) = UAddw(UAddw(t, u), v)

Page 8: Manipulating Information (2) Arithmetic Operations

8

Unsigned Addition Forms an Abelian Group

• 0 is additive identity

– UAddw (u , 0)  =  u

• Every element has additive inverse

– Let UCompw(u) = 2^w – u for u > 0, and UCompw(0) = 0

– UAddw(u , UCompw (u ))  =  0
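A small sketch of the additive inverse in C, assuming w is the bit width of the type unsigned. The function name ucomp is illustrative and does not come from the slides.

```c
/* Additive inverse under unsigned addition for the width of 'unsigned'. */
unsigned ucomp(unsigned u) {
    return 0u - u;   /* equals 2^w - u for u > 0, and 0 for u == 0 */
}
```

Adding u and ucomp(u) with ordinary unsigned addition gives 0 for every u, which is the group property stated above.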

Page 9: Manipulating Information (2) Arithmetic Operations

9

Unsigned Addition

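Table to complete (4-bit values; -u4 x is the unsigned additive inverse of x):

x (Hex)   x (Decimal)   -u4 x (Decimal)   -u4 x (Hex)
0
5
8
D
F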

Page 10: Manipulating Information (2) Arithmetic Operations

10

Signed Addition

• Functionality
– True sum requires w+1 bits
– Drop off MSB
– Treat remaining bits as 2’s comp. integer

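$$\mathrm{TAdd}_w(u, v) = \begin{cases} u + v + 2^w, & u + v < \mathrm{TMin}_w \quad \text{(NegOver)} \\ u + v, & \mathrm{TMin}_w \le u + v \le \mathrm{TMax}_w \\ u + v - 2^w, & \mathrm{TMax}_w < u + v \quad \text{(PosOver)} \end{cases}$$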
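The three cases of signed addition (negative overflow, no overflow, positive overflow) can be tried out for small word sizes. The sketch below assumes int has at least 32 bits and 2 ≤ w ≤ 16, so the true sum always fits; the name tadd_w and the explicit w parameter are illustrative choices.

```c
/* Two's-complement addition for a small word size w (2 <= w <= 16),
   evaluated with plain int arithmetic so the true sum never overflows. */
int tadd_w(int u, int v, int w) {
    int two_w = 1 << w;              /* 2^w */
    int tmax  = (1 << (w - 1)) - 1;  /* TMax_w */
    int tmin  = -tmax - 1;           /* TMin_w */
    int sum   = u + v;               /* true sum, needs w+1 bits */

    if (sum < tmin) return sum + two_w;  /* negative overflow */
    if (sum > tmax) return sum - two_w;  /* positive overflow */
    return sum;                          /* result fits in w bits */
}
```

With w = 8, tadd_w(127, 1, 8) wraps to -128 and tadd_w(-128, -1, 8) wraps to 127.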

Page 11: Manipulating Information (2) Arithmetic Operations

11

Signed Addition

Page 12: Manipulating Information (2) Arithmetic Operations

12

Signed Addition

Page 13: Manipulating Information (2) Arithmetic Operations

13

Signed Addition

Page 14: Manipulating Information (2) Arithmetic Operations

14

Detecting Tadd Overflow

• Task
– Given s = TAddw(u, v)
– Determine if s = Addw(u, v), the true integer sum
• Claim
– Overflow iff either:
• u, v < 0, s ≥ 0 (NegOver)
• u, v ≥ 0, s < 0 (PosOver)
– ovf = ((u < 0) == (v < 0)) && ((u < 0) != (s < 0));

Page 15: Manipulating Information (2) Arithmetic Operations

15

Mathematical Properties of TAdd

• Two’s Complement Under TAdd Forms a Group
– Closed, Commutative, Associative, 0 is additive identity
– Every element has additive inverse

• Let

$$\mathrm{TComp}_w(u) = \begin{cases} -u, & u \neq \mathrm{TMin}_w \\ \mathrm{TMin}_w, & u = \mathrm{TMin}_w \end{cases}$$

• TAddw(u, TCompw(u)) = 0

Page 16: Manipulating Information (2) Arithmetic Operations

16

Detecting Tadd Overflow

/* Determine whether arguments can be added without overflow */

/* WARNING: This code is buggy. */

int tadd_ok(int x, int y) {

int sum = x+y;

return (sum-x == y) && (sum-y == x);

}

Page 17: Manipulating Information (2) Arithmetic Operations

17

Detecting Tadd Overflow

/* Determine whether arguments can be subtracted without overflow */

/* WARNING: This code is buggy. */

int tsub_ok(int x, int y) {

return tadd_ok(x, -y);

}
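The subtraction check above goes wrong when y is TMin, because -y is then TMin again. One way to repair it is sketched below; the names tadd_ok_fixed and tsub_ok_fixed are hypothetical, and the addition check uses range comparisons rather than the sum itself, so no signed overflow ever occurs.

```c
#include <limits.h>

/* Return 1 if x + y does not overflow. */
static int tadd_ok_fixed(int x, int y) {
    if (x > 0 && y > 0) return x <= INT_MAX - y;  /* positive overflow check */
    if (x < 0 && y < 0) return x >= INT_MIN - y;  /* negative overflow check */
    return 1;                                     /* mixed signs never overflow */
}

/* Return 1 if x - y does not overflow. */
int tsub_ok_fixed(int x, int y) {
    if (y == INT_MIN)       /* -y is not representable */
        return x < 0;       /* x - TMin fits only when x is negative */
    return tadd_ok_fixed(x, -y);
}
```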

Page 18: Manipulating Information (2) Arithmetic Operations

18

Mathematical Properties of TAdd

• Isomorphic Algebra to UAdd

– TAddw (u , v) = U2T (UAddw(T2U(u ), T2U(v)))

• Since both have identical bit patterns

– T2U(TAddw (u , v)) = UAddw(T2U(u ), T2U(v))

Page 19: Manipulating Information (2) Arithmetic Operations

19

Negating with Complement & Increment

• In C
– ~x + 1 == -x
• Complement
– Observation: ~x + x == 1111…111 == -1
• Increment
– ~x + x + (-x + 1) == -1 + (-x + 1)
– ~x + 1 == -x

   x    1 0 0 1 0 1 1 1
+ ~x    0 1 1 0 1 0 0 0
  -1    1 1 1 1 1 1 1 1
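The identity can be checked at the bit level in C. The sketch works on unsigned values so that no signed overflow is involved; the name negate_bits is illustrative.

```c
/* Negate a bit pattern by complementing it and adding one. */
unsigned negate_bits(unsigned x) {
    return ~x + 1u;   /* same bit pattern as -x in two's complement */
}
```

The result for 5 is the same bit pattern that the conversion (unsigned)-5 produces, and ~x + x is all ones for any x.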

Page 20: Manipulating Information (2) Arithmetic Operations

20

Multiplication

• Computing Exact Product of w-bit numbers x, y
– Either signed or unsigned
• Ranges
– Unsigned: 0 ≤ x * y ≤ (2^w – 1)^2 = 2^(2w) – 2^(w+1) + 1
• Up to 2w bits
– Two’s complement min: x * y ≥ (–2^(w–1)) * (2^(w–1) – 1) = –2^(2w–2) + 2^(w–1)
• Up to 2w–1 bits
– Two’s complement max: x * y ≤ (–2^(w–1))^2 = 2^(2w–2)
• Up to 2w bits, but only for (TMinw)^2

Page 21: Manipulating Information (2) Arithmetic Operations

21

Multiplication

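• Unsigned: UMultw(x, y) = (x · y) mod 2^w

• Signed: TMultw(x, y) = U2T((x · y) mod 2^w)

• Given two bit vectors x and y, the bit-level representation of the truncated unsigned product is identical to that of the truncated two’s-complement product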

Page 22: Manipulating Information (2) Arithmetic Operations

22

Multiplication

• Maintaining Exact Results
– Would need to keep expanding word size with each product computed
– Done in software by “arbitrary precision” arithmetic packages

Page 23: Manipulating Information (2) Arithmetic Operations

23

Power-of-2 Multiply with Shift

[Figure: multiplying a w-bit operand u by 2^k]

Operands: w bits (u and 2^k)

True Product: w+k bits (u · 2^k)

Discard k bits: w bits, UMultw(u, 2^k) and TMultw(u, 2^k); the k low-order bits of the result are 0

Page 24: Manipulating Information (2) Arithmetic Operations

24

Power-of-2 Multiply with Shift

• Operation
– u << k gives u * 2^k
– Both signed and unsigned
• Examples
– u << 3 == u * 8
– (u << 5) - (u << 3) == u * 24 (the parentheses are required: in C, - binds tighter than <<)
– Most machines shift and add much faster than multiply
• Compiler will generate this code automatically
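The two examples written as C functions on unsigned values, where shifting is well defined and wraps like multiplication. The function names are illustrative.

```c
/* u * 8 as a single shift. */
unsigned times8(unsigned u) {
    return u << 3;
}

/* u * 24 as (u * 32) - (u * 8). The parentheses matter because
   '-' has higher precedence than '<<' in C. */
unsigned times24(unsigned u) {
    return (u << 5) - (u << 3);
}
```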

Page 25: Manipulating Information (2) Arithmetic Operations

25

Security Vulnerability in the XDR Library

/*
 * Illustration of code vulnerability similar to that found in
 * Sun's XDR library.
 */

Page 26: Manipulating Information (2) Arithmetic Operations

26

Security Vulnerability in the XDR Library

void* copy_elements(void *ele_src[], int ele_cnt, size_t ele_size) {
    /*
     * Allocate buffer for ele_cnt objects, each of ele_size bytes
     * and copy from locations designated by ele_src
     */
    void *result = malloc(ele_cnt * ele_size);
    if (result == NULL)
        /* malloc failed */
        return NULL;

Page 27: Manipulating Information (2) Arithmetic Operations

27

Security Vulnerability in the XDR Library

    void *next = result;
    int i;
    for (i = 0; i < ele_cnt; i++) {
        /* Copy object i to destination */
        memcpy(next, ele_src[i], ele_size);
        /* Move pointer to next memory region */
        next += ele_size;
    }
    return result;
}
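The listing never checks whether the product ele_cnt * ele_size fits in a size_t. If it wraps, malloc receives a small size while the loop still copies ele_cnt objects past the end of the buffer. A guarded variant might look like the sketch below; the name copy_elements_checked and the choice to return NULL on a bad size are assumptions of this example, not the actual library fix.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

void *copy_elements_checked(void *ele_src[], int ele_cnt, size_t ele_size) {
    if (ele_cnt < 0)
        return NULL;                          /* a negative count is meaningless */
    size_t cnt = (size_t)ele_cnt;

    /* Reject sizes whose product would wrap around modulo 2^w. */
    if (ele_size != 0 && cnt > SIZE_MAX / ele_size)
        return NULL;

    unsigned char *result = malloc(cnt * ele_size);
    if (result == NULL)
        return NULL;                          /* malloc failed */

    unsigned char *next = result;             /* byte pointer: portable arithmetic */
    for (size_t i = 0; i < cnt; i++) {
        memcpy(next, ele_src[i], ele_size);   /* copy object i to destination */
        next += ele_size;                     /* move to the next region */
    }
    return result;
}
```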

Page 28: Manipulating Information (2) Arithmetic Operations

28

Machine-Level Representation of Programs I

Page 29: Manipulating Information (2) Arithmetic Operations

29

Outline

• Memory and Registers

• Suggested reading

– Chap 3.1, 3.2, 3.3, 3.4

Page 30: Manipulating Information (2) Arithmetic Operations

30

Characteristics of the high level programming languages

• Abstraction
– Productive
– Reliable
• Type checking
• As efficient as hand-written code
• Can be compiled and executed on a number of different machines

Page 31: Manipulating Information (2) Arithmetic Operations

31

Characteristics of the assembly programming languages

• Managing memory
• Low-level instructions to carry out the computation
• Highly machine specific

Page 32: Manipulating Information (2) Arithmetic Operations

32

Why should we understand the assembly code

• Understand the optimization capabilities of the compiler

• Analyze the underlying inefficiencies in the code

• Sometimes the run-time behavior of a program is needed

Page 33: Manipulating Information (2) Arithmetic Operations

33

From writing assembly code to understanding assembly code

• Different set of skills
– Transformations
– Relation between source code and assembly code
• Reverse engineering
– Trying to understand the process by which a system was created
• By studying the system and
• By working backward

Page 34: Manipulating Information (2) Arithmetic Operations

Understanding how compilation systems work

• Optimizing program performance

• Understanding link-time errors

• Avoiding security holes

– Buffer Overflow

34

Page 35: Manipulating Information (2) Arithmetic Operations

35

C constructs

• Variable

– Different data types can be declared

• Operation

– Arithmetic expression evaluation

• control

– Loops

– Procedure calls and returns

Page 36: Manipulating Information (2) Arithmetic Operations

36

Code Examples

C code

int accum = 0;

int sum(int x, int y)
{
    int t = x+y;
    accum += t;
    return t;
}

Page 37: Manipulating Information (2) Arithmetic Operations

37

Code Examples

C code

int accum = 0;

int sum(int x, int y)
{
    int t = x+y;
    accum += t;
    return t;
}

Assembly file code.s, obtained with the command gcc -O2 -S code.c

_sum:
    pushl %ebp
    movl %esp,%ebp
    movl 12(%ebp),%eax
    addl 8(%ebp),%eax
    addl %eax, accum
    movl %ebp,%esp
    popl %ebp
    ret

Page 38: Manipulating Information (2) Arithmetic Operations

A Historical Perspective

• Long evolutionary development
– Started from rather primitive 16-bit processors
– Added more features
• Take advantage of technology improvements
• Satisfy the demands for higher performance and for supporting more advanced operating systems
– Laden with features providing backward compatibility that are obsolete

38

Page 39: Manipulating Information (2) Arithmetic Operations

X86 family

• 8086 (1978, 29K)
– The heart of the IBM PC & DOS (8088)
– 16-bit, 1M bytes addressable, 640K for users
– x87 for floating point
• 80286 (1982, 134K)
– More (now obsolete) addressing modes
– Basis of the IBM PC-AT & Windows
• i386 (1985, 275K)
– 32-bit architecture, flat addressing model
– Supports a Unix operating system

39

Page 40: Manipulating Information (2) Arithmetic Operations

X86 family

• i486 (1989, 1.9M)
– Integrated the floating-point unit onto the processor chip
• Pentium (1993, 3.1M)
– Improved performance, added minor extensions
• PentiumPro (1995, 5.5M)
– P6 microarchitecture
– Conditional move instructions
• Pentium II (1997, 7M)
– Continuation of the P6

40

Page 41: Manipulating Information (2) Arithmetic Operations

X86 family

• Pentium III (1999, 8.2M)
– New class of instructions for manipulating vectors of floating-point numbers (SSE, Streaming SIMD Extensions)
– Later grew to 24M due to the incorporation of the level-2 cache
• Pentium 4 (2000, 42M)
– NetBurst microarchitecture with high clock rate but high power consumption
– SSE2 instructions, new data types (e.g., double precision)

41

Page 42: Manipulating Information (2) Arithmetic Operations

X86 family

• Pentium 4E (2004, 125M transistors)
– Added hyperthreading
• Run two programs simultaneously on a single processor
– EM64T, 64-bit extension to IA32
• First developed by Advanced Micro Devices (AMD)
• x86-64
• Core 2 (2006, 291M transistors)
– Back to a microarchitecture similar to P6
– Multi-core (multiple processors on a single chip)
– Did not support hyperthreading

42

Page 43: Manipulating Information (2) Arithmetic Operations

X86 family

• Core i7 (2008, 781M transistors)
– Incorporated both hyperthreading and multi-core
– The initial version supported two executing programs on each core
• Core i7 (2011.11, 2.27B transistors)
– 6 cores on each chip
– 3.3 GHz
– 6 × 256 KB (L2), 15 MB (L3)

43

Page 44: Manipulating Information (2) Arithmetic Operations

X86 family

• Advanced Micro Devices (AMD)
– At the beginning,
• lagged just behind Intel in technology,
• produced less expensive and lower-performance processors
• In 1999
– First broke the 1-gigahertz clock-speed barrier
• In 2002
– Introduced x86-64
– The widely adopted 64-bit extension to IA32

44

Page 45: Manipulating Information (2) Arithmetic Operations

Moore’s Law

45

Page 46: Manipulating Information (2) Arithmetic Operations

46

C Code

• Add two signed integers

• int t = x+y;

Page 47: Manipulating Information (2) Arithmetic Operations

47

Assembly Code

• Operands:
– x: Register %eax
– y: Memory M[%ebp+8]
– t: Register %eax
• Instruction
– addl 8(%ebp),%eax
– Add two 4-byte integers
– Similar to expression x += y
• Return function value in %eax

Page 48: Manipulating Information (2) Arithmetic Operations

48

Assembly Programmer’s View

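[Figure: the CPU and memory as seen by the assembly programmer. Memory, with addresses running from 00 to FF, holds the Stack, DLLs, Heap, Data and Text regions. The CPU holds the registers %eax, %edx, %ecx, %ebx, %esi, %edi, %esp and %ebp (with byte registers %al/%ah, %dl/%dh, %cl/%ch, %bl/%bh), plus %eip and %eflags. Addresses, Data and Instructions pass between the CPU and memory.]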

Page 49: Manipulating Information (2) Arithmetic Operations

49

Programmer-Visible States

• Program Counter(%eip)

– Address of the next instruction

• Register File

– Heavily used program data

– Integer and floating-point

Page 50: Manipulating Information (2) Arithmetic Operations

50

Programmer-Visible States

• Condition code register
– Holds status information about the most recently executed instruction
– Used to implement conditional changes in the control flow

Page 51: Manipulating Information (2) Arithmetic Operations

51

Operands

• In high level languages
– Either constants
– Or variables
• Example
– A = A + 4 (A is a variable, 4 is a constant)

Page 52: Manipulating Information (2) Arithmetic Operations

52

Where are the variables? — registers & Memory

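[Figure: variables live either in registers or in memory. Memory, with addresses running from 00 to FF, holds the Stack, DLLs, Heap, Data and Text regions. The CPU holds the registers %eax, %edx, %ecx, %ebx, %esi, %edi, %esp and %ebp (with byte registers %al/%ah, %dl/%dh, %cl/%ch, %bl/%bh), plus %eip and %eflags. Addresses, Data and Instructions pass between the CPU and memory.]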

Page 53: Manipulating Information (2) Arithmetic Operations

53

Operands

• Counterparts in assembly languages
– Immediate (constant)
– Register (variable)
– Memory (variable)
• Example

movl 8(%ebp), %eax
addl $4, %eax

(8(%ebp) is a memory operand, %eax is a register operand, $4 is an immediate operand)

Page 54: Manipulating Information (2) Arithmetic Operations

54

Simple Addressing Mode

• Immediate
– Represents a constant
– The format is $imm ($4, $0xffffffff)
• Registers
– The fastest storage units in computer systems
– Typically 32 bits long
– Register mode Ea
• The value stored in the register
• Noted as R[Ea]

Page 55: Manipulating Information (2) Arithmetic Operations

55

Virtual spaces

• A linear array of bytes
– each with its own unique address (array index) starting at zero

addresses     contents
0xffffffff    …
0xfffffffe    …
…             …
0x2           …
0x1           …
0x0           …

Page 56: Manipulating Information (2) Arithmetic Operations

56

Memory References

• The name of the array is annotated as M

• If addr is a memory address

• M[addr] is the content of the memory starting at addr

• addr is used as an array index

• How many bytes are there in M[addr]?
– It depends on the context
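A toy model of this idea in C, assuming a hypothetical 8-byte memory M with made-up contents. The same starting address yields a 1-, 2- or 4-byte value depending on which read is requested, much as the instruction suffix decides the size in assembly.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical memory contents, chosen only for illustration. */
static const uint8_t M[8] = {0x78, 0x56, 0x34, 0x12, 0xEF, 0xCD, 0xAB, 0x89};

/* One byte starting at addr. */
uint8_t read1(size_t addr) {
    return M[addr];
}

/* Two bytes starting at addr, in the machine's own byte order. */
uint16_t read2(size_t addr) {
    uint16_t v;
    memcpy(&v, &M[addr], sizeof v);
    return v;
}

/* Four bytes starting at addr, in the machine's own byte order. */
uint32_t read4(size_t addr) {
    uint32_t v;
    memcpy(&v, &M[addr], sizeof v);
    return v;
}
```

On a little-endian machine such as x86, read4(0) yields 0x12345678 while read1(0) yields only 0x78.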

Page 57: Manipulating Information (2) Arithmetic Operations

57

Indexed Addressing Mode

• An expression for
– a memory address (or an array index)

• Most general form

– Imm(Eb, Ei, s)

– Constant “displacement” Imm: 1, 2 or 4 bytes

– Base register Eb: Any of 8 integer registers

– Index register Ei: Any, except for %esp

– Scale s: 1, 2, 4, or 8

Page 58: Manipulating Information (2) Arithmetic Operations

58

Memory Addressing Mode

• The address represented by the above form

– imm + R[Eb] + R[Ei] * s

• It gives the value

– M[imm + R[Eb] + R[Ei] * s]
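The address formula is the same computation a C compiler performs for array indexing. The sketch below evaluates Imm + R[Eb] + R[Ei] * s on plain integers; the function name and the use of uintptr_t are illustrative, and treating pointers as flat integers is an assumption that holds on typical platforms.

```c
#include <stdint.h>

/* Address denoted by Imm(Eb, Ei, s): Imm + R[Eb] + R[Ei] * s. */
uintptr_t effective_address(uintptr_t imm, uintptr_t base,
                            uintptr_t index, unsigned scale) {
    return imm + base + index * scale;
}
```

For an int array a, the element a[3] sits at effective_address(0, (uintptr_t)a, 3, sizeof(int)), which is the scaled indexed form with the array start as base and the subscript as index.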

Page 59: Manipulating Information (2) Arithmetic Operations

59

Addressing Mode

Type       Form             Operand value               Name
Immediate  $Imm             Imm                         Immediate
Register   Ea               R[Ea]                       Register
Memory     Imm              M[Imm]                      Absolute
Memory     (Ea)             M[R[Ea]]                    Indirect
Memory     Imm(Eb)          M[Imm + R[Eb]]              Base + displacement
Memory     (Eb, Ei)         M[R[Eb] + R[Ei]]            Indexed
Memory     Imm(Eb, Ei)      M[Imm + R[Eb] + R[Ei]]      Indexed
Memory     (, Ei, s)        M[R[Ei]*s]                  Scaled indexed
Memory     (Eb, Ei, s)      M[R[Eb] + R[Ei]*s]          Scaled indexed
Memory     Imm(Eb, Ei, s)   M[Imm + R[Eb] + R[Ei]*s]    Scaled indexed

Page 60: Manipulating Information (2) Arithmetic Operations

60

Address   Value
0x100     0xFF
0x104     0xAB
0x108     0x13
0x10C     0x11

Register  Value
%eax      0x100
%ecx      0x1
%edx      0x3

Operand           Value
%eax              0x100
(%eax)            0xFF
$0x108            0x108
(%eax,%edx,4)     0x11 (address 0x10C)
260(%ecx,%edx)    0x13 (address 0x108)
0x108             0x13
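The operand values can be reproduced with a small model of the memory and register contents listed on this slide. The struct, the function names and the rule that unlisted addresses read as 0 are assumptions of this sketch.

```c
#include <stdint.h>

/* The four memory words from the slide; other addresses read as 0 here. */
uint32_t mem_read(uint32_t addr) {
    switch (addr) {
    case 0x100: return 0xFF;
    case 0x104: return 0xAB;
    case 0x108: return 0x13;
    case 0x10C: return 0x11;
    default:    return 0;
    }
}

/* The three registers used by the operands on the slide. */
struct regs { uint32_t eax, ecx, edx; };

/* (%eax): indirect */
uint32_t operand_indirect(const struct regs *r) {
    return mem_read(r->eax);
}

/* (%eax,%edx,4): scaled indexed */
uint32_t operand_scaled(const struct regs *r) {
    return mem_read(r->eax + r->edx * 4);
}

/* 260(%ecx,%edx): displacement + base + index, with 260 == 0x104 */
uint32_t operand_disp_indexed(const struct regs *r) {
    return mem_read(260 + r->ecx + r->edx);
}
```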

Page 61: Manipulating Information (2) Arithmetic Operations

61

Code Examples

C code

int accum = 0;

int sum(int x, int y)
{
    int t = x+y;
    accum += t;
    return t;
}

Assembly file code.s, obtained with the command gcc -O2 -S code.c

_sum:
    pushl %ebp
    movl %esp,%ebp
    movl 12(%ebp),%eax
    addl 8(%ebp),%eax
    addl %eax, accum
    movl %ebp,%esp
    popl %ebp
    ret

Page 62: Manipulating Information (2) Arithmetic Operations

62

Code Examples

55 89 e5 8b 45 0c 03 45 08 01 05 00 00 00 00 89 ec 5d c3

Obtain with command

gcc -O2 -c code.c

Relocatable object file code.o

Page 63: Manipulating Information (2) Arithmetic Operations

63

Code Examples

Obtain with command

objdump -d code.o

Disassembly output

0x80483b4 <sum>:
0x80483b4:  55                  push %ebp
0x80483b5:  89 e5               mov  %esp,%ebp
0x80483b7:  8b 45 0c            mov  0xc(%ebp),%eax
0x80483ba:  03 45 08            add  0x8(%ebp),%eax
0x80483bd:  01 05 00 00 00 00   add  %eax,0x0
0x80483c3:  89 ec               mov  %ebp,%esp
0x80483c5:  5d                  pop  %ebp
0x80483c6:  c3                  ret

Page 64: Manipulating Information (2) Arithmetic Operations

64

Object Code

• 3-byte instruction

• Stored at address 0x80483ba

• 0x80483ba: 03 45 08

Page 65: Manipulating Information (2) Arithmetic Operations

65

Operations in Assembly Instructions

• Each instruction performs only a very elementary operation

• Instructions normally execute one by one, in sequence

• Operate on data stored in registers

• Transfer data between memory and a register

• Conditionally branch to a new instruction address

Page 66: Manipulating Information (2) Arithmetic Operations

66

Understanding Machine Execution

• Where is the sequence of instructions stored?
– In virtual memory
– Code area
• How are the instructions executed?
– %eip stores a memory address
– The machine reads one whole instruction from that address
– then executes it
– and increases %eip
• %eip is also called the program counter (PC)
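How %eip steps through variable-length instructions can be sketched with the byte lengths of the eight instructions of sum, read off the disassembly listing earlier in this deck. This is a toy model of sequential fetch only: branches and the effect of ret on %eip are ignored, and the names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte lengths of the eight instructions of sum, in program order. */
static const uint8_t insn_len[] = {1, 2, 3, 3, 6, 2, 1, 1};

/* Simulated %eip after fetching the first n instructions from 'start'. */
uint32_t eip_after(uint32_t start, size_t n) {
    uint32_t eip = start;
    for (size_t i = 0; i < n && i < sizeof insn_len; i++) {
        eip += insn_len[i];   /* read one whole instruction, then step past it */
    }
    return eip;
}
```

Starting at 0x80483b4, three fetches bring %eip to 0x80483ba, the address of the 3-byte add instruction 03 45 08.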

Page 67: Manipulating Information (2) Arithmetic Operations

67

Code Layout

Linux/x86 process memory image (from high addresses to low):

0xffffffff
kernel virtual memory (memory invisible to user code)
0xc0000000
…
Read/write data
Read only data
Read only code (%eip points into this region)
0x08048000
forbidden

Page 68: Manipulating Information (2) Arithmetic Operations

68

Data layout

• Object model in assembly
– A large, byte-addressable array
– No distinctions even between signed or unsigned integers
– Code, user data, OS data
– Run-time stack for managing procedure call and return
– Blocks of memory allocated by user

Page 69: Manipulating Information (2) Arithmetic Operations

69