Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov...
Transcript of Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov...
![Page 1: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/1.jpg)
Adam Blank Fall 2020Lecture 3
CS24
Introduction to Computing
Systems
![Page 2: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/2.jpg)
CS 24: Introduction to Computing Systems
x86-64 Basics
mov %edi, %eax
xor %ecx, %ecx
test %edi, %edi
setne %cl
mov $0xffffffff, %eax
cmovns %ecx, %eax
retq
![Page 3: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/3.jpg)
Assembly? Why Bother? 1
In 2020, almost all assembly is written by compilers or in theoperating system, but who writes those?
In 2020, the high-level language model occasionally breaks down andyou have to read the assembly to understand the machine’s behavior!
In 2020, it’s important to understand the types of optimizations acompiler is capable of making–and those it isn’t!
In 2020, software is generally distributed in binary form; if you wantto reverse engineer or security audit software, it’s going to beassembly!
![Page 4: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/4.jpg)
Assembly? Why Bother? 1
In 2020, almost all assembly is written by compilers or in theoperating system, but who writes those?
In 2020, the high-level language model occasionally breaks down andyou have to read the assembly to understand the machine’s behavior!
In 2020, it’s important to understand the types of optimizations acompiler is capable of making–and those it isn’t!
In 2020, software is generally distributed in binary form; if you wantto reverse engineer or security audit software, it’s going to beassembly!
![Page 5: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/5.jpg)
Assembly? Why Bother? 1
In 2020, almost all assembly is written by compilers or in theoperating system, but who writes those?
In 2020, the high-level language model occasionally breaks down andyou have to read the assembly to understand the machine’s behavior!
In 2020, it’s important to understand the types of optimizations acompiler is capable of making–and those it isn’t!
In 2020, software is generally distributed in binary form; if you wantto reverse engineer or security audit software, it’s going to beassembly!
![Page 6: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/6.jpg)
Assembly? Why Bother? 1
In 2020, almost all assembly is written by compilers or in theoperating system, but who writes those?
In 2020, the high-level language model occasionally breaks down andyou have to read the assembly to understand the machine’s behavior!
In 2020, it’s important to understand the types of optimizations acompiler is capable of making–and those it isn’t!
In 2020, software is generally distributed in binary form; if you wantto reverse engineer or security audit software, it’s going to beassembly!
![Page 7: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/7.jpg)
Assembly? Why Bother? 1
In 2020, almost all assembly is written by compilers or in theoperating system, but who writes those?
In 2020, the high-level language model occasionally breaks down andyou have to read the assembly to understand the machine’s behavior!
In 2020, it’s important to understand the types of optimizations acompiler is capable of making–and those it isn’t!
In 2020, software is generally distributed in binary form; if you wantto reverse engineer or security audit software, it’s going to beassembly!
![Page 8: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/8.jpg)
Compiling a Program 2
blank@compute-cpu2:~$ cat identity.cint identity(int x) {
return x;}
blank@compute-cpu2:~$ clang -S identity.cblank@compute-cpu2:~$ cat identity.sidentity:
movl %edi, %eaxretq
![Page 9: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/9.jpg)
Disassembling a Program 3
blank@compute-cpu2:~$ cat identity.cint identity(int x) {
return x;}
blank@compute-cpu2:~$ clang -c identity.cblank@compute-cpu2:~$ objdump -d identity.o
simple.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <identity>:0: 89 f8 mov %edi,%eax2: c3 retq
![Page 10: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/10.jpg)
What kinds of “data“ are there? 4
Immediates: Constant integer dataExamples: $0x400, $-533Like C literal, but prefixed with ‘$’Encoded with 1, 2, 4, or 8 bytes depending on the instruction
Registers: behave like “global variables”, but hardwired in theprocessor
Examples: %eax, %ediSome of them are reserved for special uses or have special meanings
Memory: Consecutive bytes of memory
![Page 11: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/11.jpg)
What are these %eax thingies? 5
Registers are locations in the CPU that store a small amount of data,which can be accessed very quickly (once every clock cycle). They havenames (e.g., %rsi)–not addresses. They are a precious commodity in allarchitectures, but especially x86-64.
%eax
%ebx
%ecx
%edx
%edi
%esi
%esp
%ebp
(return)
(arg 4)
(arg 3)
(arg 1)
(arg 2)
![Page 12: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/12.jpg)
x86-64 Instructions 6
There are three major classes of things assembly instructions do:
1 Transfer data between memory and registersLoad data from memory into register: %reg = mem[addr]Store register data into memory: mem[addr] = %reg
2 Perform arithmetic operation on register or memory datae.g., %eax += %ebxe.g., %eax += mem[addr]
3 Re-direct control flow (jumps and gotos)
![Page 13: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/13.jpg)
Understanding Identity (mov) 7
C Code1 int identity(int x) {2 return x;3 }
x86-64 Disassembly0000000000400480 <identity>:400480: 89 f8 mov %edi,%eax400482: c3 retq
x86-64 Instruction: movx86-64 C Pseudocode
mov %src, %dst ↔ %dst = %src
Pseudocode Translation (so far)1 identity:2 %eax = %edi3 retq
![Page 14: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/14.jpg)
Understanding Identity (mov) 7
C Code1 int identity(int x) {2 return x;3 }
x86-64 Disassembly0000000000400480 <identity>:400480: 89 f8 mov %edi,%eax400482: c3 retq
x86-64 Instruction: movx86-64 C Pseudocode
mov %src, %dst ↔ %dst = %src
Pseudocode Translation (so far)1 identity:2 %eax = %edi3 retq
![Page 15: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/15.jpg)
Understanding Identity (mov) 7
C Code1 int identity(int x) {2 return x;3 }
x86-64 Disassembly0000000000400480 <identity>:400480: 89 f8 mov %edi,%eax400482: c3 retq
x86-64 Instruction: movx86-64 C Pseudocode
mov %src, %dst ↔ %dst = %src
Pseudocode Translation (so far)1 identity:2 %eax = %edi3 retq
![Page 16: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/16.jpg)
Understanding Identity (retq and return value) 8
C Code1 int identity(int x) {2 return x;3 }
x86-64 Disassembly0000000000400480 <identity>:400480: 89 f8 mov %edi,%eax400482: c3 retq
x86-64 Instruction: movx86-64 C Pseudocode
mov %src, %dst ↔ %dst = %src
x86-64 Instruction: retqx86-64 C Pseudocode
retq ↔ return %eax
Pseudocode Translation (so far)1 identity:2 %eax = %edi3 return %eax
![Page 17: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/17.jpg)
Understanding Identity (retq and return value) 8
C Code1 int identity(int x) {2 return x;3 }
x86-64 Disassembly0000000000400480 <identity>:400480: 89 f8 mov %edi,%eax400482: c3 retq
x86-64 Instruction: movx86-64 C Pseudocode
mov %src, %dst ↔ %dst = %src
x86-64 Instruction: retqx86-64 C Pseudocode
retq ↔ return %eax
Pseudocode Translation (so far)1 identity:2 %eax = %edi3 return %eax
![Page 18: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/18.jpg)
Understanding Identity (arguments) 9
C Code1 int identity(int x) {2 return x;3 }
x86-64 Disassembly0000000000400480 <identity>:400480: 89 f8 mov %edi,%eax400482: c3 retq
x86-64 Instruction: movx86-64 C Pseudocode
mov %src, %dst ↔ %dst = %src
x86-64 Instruction: retqx86-64 C Pseudocode
retq ↔ return %eax
Pseudocode Translation (so far)1 identity(%edi):2 %eax = %edi3 return %eax
![Page 19: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/19.jpg)
Understanding Identity (arguments) 9
C Code1 int identity(int x) {2 return x;3 }
x86-64 Disassembly0000000000400480 <identity>:400480: 89 f8 mov %edi,%eax400482: c3 retq
x86-64 Instruction: movx86-64 C Pseudocode
mov %src, %dst ↔ %dst = %src
x86-64 Instruction: retqx86-64 C Pseudocode
retq ↔ return %eax
Pseudocode Translation (so far)1 identity(%edi):2 %eax = %edi3 return %eax
![Page 20: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/20.jpg)
Understanding Identity (simplifying) 10
C Code1 int identity(int x) {2 return x;3 }
x86-64 Disassembly0000000000400480 <identity>:400480: 89 f8 mov %edi,%eax400482: c3 retq
x86-64 Instruction: movx86-64 C Pseudocode
mov %src, %dst ↔ %dst = %src
x86-64 Instruction: retqx86-64 C Pseudocode
retq ↔ return %eax
Pseudocode Translation (so far)1 identity(%edi):2 return %edi
![Page 21: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/21.jpg)
Understanding Identity (simplifying) 10
C Code1 int identity(int x) {2 return x;3 }
x86-64 Disassembly0000000000400480 <identity>:400480: 89 f8 mov %edi,%eax400482: c3 retq
x86-64 Instruction: movx86-64 C Pseudocode
mov %src, %dst ↔ %dst = %src
x86-64 Instruction: retqx86-64 C Pseudocode
retq ↔ return %eax
Pseudocode Translation (so far)1 identity(%edi):2 return %edi
![Page 22: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/22.jpg)
x86-64 Summary So Far 11
System V AMD64 ABIThe value in %eax is automatically returned by retq.The first argument to a function is stored in %edi.
Things to Notice About x86-64There are no types!!!The conventions define what the compiler is allowed to do.
![Page 23: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/23.jpg)
Understanding Arithmetic (arithmetic instructions) 12
0000000000000000 <arith>:0: 83 c7 01 add $0x1,%edi3: 0f af ff imul %edi,%edi6: 89 f8 mov %edi,%eax8: c3 retq
Pseudocode Translation (so far)1 arith(%edi):2 add $0x1,%edi3 imul %edi,%edi4 %eax = %edi5 return %eax
x86-64 Instruction: (Some) Arithmetic Operations
x86-64 C Pseudocode
add %src, %dst ↔ %dst += %src
sub %src, %dst ↔ %dst -= %src
imul %src, %dst ↔ %dst *= %src
![Page 24: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/24.jpg)
Understanding Arithmetic (arithmetic instructions) 12
0000000000000000 <arith>:0: 83 c7 01 add $0x1,%edi3: 0f af ff imul %edi,%edi6: 89 f8 mov %edi,%eax8: c3 retq
Pseudocode Translation (so far)1 arith(%edi):2 add $0x1,%edi3 imul %edi,%edi4 %eax = %edi5 return %eax
x86-64 Instruction: (Some) Arithmetic Operations
x86-64 C Pseudocode
add %src, %dst ↔ %dst += %src
sub %src, %dst ↔ %dst -= %src
imul %src, %dst ↔ %dst *= %src
![Page 25: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/25.jpg)
Understanding Arithmetic (arithmetic instructions) 12
0000000000000000 <arith>:0: 83 c7 01 add $0x1,%edi3: 0f af ff imul %edi,%edi6: 89 f8 mov %edi,%eax8: c3 retq
Pseudocode Translation (so far)1 arith(%edi):2 add $0x1,%edi3 imul %edi,%edi4 %eax = %edi5 return %eax
x86-64 Instruction: (Some) Arithmetic Operations
x86-64 C Pseudocode
add %src, %dst ↔ %dst += %src
sub %src, %dst ↔ %dst -= %src
imul %src, %dst ↔ %dst *= %src
![Page 26: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/26.jpg)
Understanding Arithmetic (arithmetic operations) 13
0000000000000000 <arith>:0: 83 c7 01 add $0x1,%edi3: 0f af ff imul %edi,%edi6: 89 f8 mov %edi,%eax8: c3 retq
x86-64 Instruction: (Some) Arithmetic Operations
x86-64 C Pseudocode
add %src, %dst ↔ %dst += %src
sub %src, %dst ↔ %dst -= %src
imul %src, %dst ↔ %dst *= %src
Pseudocode Translation (so far)1 arith(%edi):2 %edi += $0x13 %edi *= %edi4 %eax = %edi5 return %eax
![Page 27: Adam Blank Fall 2020Lecture 3 CS 24 · CS 24: Introduction to Computing Systems x86-64 Basics mov %edi, %eax xor %ecx, %ecx test %edi, %edi setne %cl mov $0xffffffff, %eax cmovns](https://reader035.fdocuments.in/reader035/viewer/2022081617/6044720280e8b20e801586e0/html5/thumbnails/27.jpg)
Understanding Arithmetic (simplifying) 14
0000000000000000 <arith>:0: 83 c7 01 add $0x1,%edi3: 0f af ff imul %edi,%edi6: 89 f8 mov %edi,%eax8: c3 retq
x86-64 Instruction: (Some) Arithmetic Operations
x86-64 C Pseudocode
add %src, %dst ↔ %dst += %src
sub %src, %dst ↔ %dst -= %src
imul %src, %dst ↔ %dst *= %src
Pseudocode Translation (so far)1 arith(%edi):2 return (%edi + $0x1) * (%edi + $0x1)