# Machine Level Programming

## History of Intel Processors

* Evolutionary design: **backwards compatible** all the way back to the `8086`, introduced in 1978

* **C**omplex **I**nstruction **S**et **C**omputer (CISC)

**RISC vs CISC**

* CISC has variable-length instructions
* RISC has fixed-length instructions

1. Intel x86 (8086)
   1. IA32 to IA64
   2. (after x86-64) EM64T (almost the same as AMD x86-64)
2. AMD x86-64

## C, Assembly, Machine Code

* ***Architecture***: the parts of a processor design that one needs to understand in order to write assembly/machine code
  * **ISA (Instruction Set Architecture)**
    * e.g., x86, IA32, Itanium, x86-64, ARM
* ***Microarchitecture***: implementation of the architecture

Forms of code:

* Machine code: the byte-level programs that a processor executes
* Assembly code: a text representation of machine code

### Assembly/Machine Code View

Programmer-Visible State (exposed by the ISA)

* PC (Program Counter)
  * Address of next instruction
  * `%rip` in x86-64
* Register file
  * Heavily used program data
* Condition codes
  * Store status information about the most recent arithmetic or logical operation
  * Used for conditional branching
* Memory (external)

Compiling Into Assembly

```c {cmd=gcc, args=[-Og -x c -c $input_file -o 3_1.o]}
long plus(long x, long y);

void sumstore(long x, long y, long *dest) {
    long t = plus(x, y);
    *dest = t;
}
```

```sh {cmd hide}
while ! [ -r 3_1.o ]; do sleep .1; done; objdump -d 3_1.o
```

### Integer Registers

In x86-64:

* `ax` `bx` `cx` `dx` `si` `di` `sp` `bp`: prefix `r` for the full 8 bytes (e.g. `%rax`), `e` for the low 4 bytes (e.g. `%eax`)
* `r8` `r9` `r10` `r11` `r12` `r13` `r14` `r15`: append `d` for the low 4 bytes (e.g. `%r8d`)

In IA32:

| 32-bit | 16-bit | 8-bit (high, low) | origin            |
| ------ | ------ | ----------------- | ----------------- |
| `eax`  | `ax`   | `ah`, `al`        | accumulator       |
| `ecx`  | `cx`   | `ch`, `cl`        | counter           |
| `edx`  | `dx`   | `dh`, `dl`        | data              |
| `ebx`  | `bx`   | `bh`, `bl`        | base              |
| `esi`  | `si`   | none              | source index      |
| `edi`  | `di`   | none              | destination index |
| `esp`  | `sp`   | none              | stack pointer     |
| `ebp`  | `bp`   | none              | base pointer      |

The low-byte names `sil`, `dil`, `spl`, and `bpl` exist only in x86-64.

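The way the narrower register names alias the low bits of the full register can be sketched in C with plain casts and shifts. This is only an illustration: the `view_*` helper names are hypothetical, and the 64-bit value stands in for the contents of `%rax`.

```c
#include <stdint.h>

/* Hypothetical helpers: each returns the part of a 64-bit register
   value that the named sub-register would expose. */
uint32_t view_eax(uint64_t rax) { return (uint32_t)rax; }        /* low 32 bits */
uint16_t view_ax(uint64_t rax)  { return (uint16_t)rax; }        /* low 16 bits */
uint8_t  view_al(uint64_t rax)  { return (uint8_t)rax; }         /* bits 0-7    */
uint8_t  view_ah(uint64_t rax)  { return (uint8_t)(rax >> 8); }  /* bits 8-15   */
```

For example, if the register held `0x1122334455667788`, `view_ax` would yield `0x7788`, with `0x77` as the `ah` part and `0x88` as the `al` part.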
#### understanding `movq`

In x86-64 assembly there are three operand types: **immediate, register, memory**

* immediate: constant integer data, like `$0x400` or `$-533`
* register: one of the 16 integer registers
  * e.g., `%rax`, `%r13`
  * but `%rsp` is reserved for special use
* memory: 8 bytes of memory at the address given by a register
  * e.g., `(%rax)`

**movq usage**

`movq src, dest`

* Limit: a single `movq` cannot do a memory-to-memory transfer, because at most one operand may be a memory reference. The value has to pass through a register.

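The legal source/destination combinations have direct C analogues. The sketch below is an illustration only: the function and variable names are hypothetical, and the registers named in the comments are just examples of what a compiler might pick.

```c
/* Each statement mirrors one legal movq source/destination combination. */
long movq_combinations(long *p) {
    long temp1, temp2;
    temp1 = 0x4;      /* movq $0x4, %rax    : immediate -> register */
    *p = -147;        /* movq $-147, (%rdi) : immediate -> memory   */
    temp2 = temp1;    /* movq %rax, %rdx    : register  -> register */
    *p = temp2;       /* movq %rdx, (%rdi)  : register  -> memory   */
    temp1 = *p;       /* movq (%rdi), %rax  : memory    -> register */
    return temp1;
}

/* Memory -> memory takes two instructions: load into a register, then store. */
void copy_long(long *dest, const long *src) {
    long t = *src;    /* movq (%rsi), %rax */
    *dest = t;        /* movq %rax, (%rdi) */
}
```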
**memory addressing modes**

* `(R)` means `Mem[Reg[R]]`
* `D(R)` means `Mem[Reg[R]+D]`, where the constant displacement `D` specifies an offset

For example, `movq 8(%rbp), %rdx` loads the 8 bytes at address `%rbp + 8` into `%rdx`.

<table>
<tr>
<td>

```c
void swap(long *xp, long *yp) {
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}
```
</td>
<td>

```nasm
swap:
    movq (%rdi), %rax
    movq (%rsi), %rdx
    movq %rdx, (%rdi)
    movq %rax, (%rsi)
    ret
```
</td>
</tr>
</table>

Complete form of memory addressing modes:

`D(Rb, Ri, S)` means `Mem[Reg[Rb] + S*Reg[Ri] + D]`

* `D`: Constant "displacement"
* `Rb`: Base register
* `Ri`: Index register
* `S`: Scale factor (1, 2, 4, or 8)

For example, with these register values:

| `%rdx`   | `%rcx`   |
| -------- | -------- |
| `0xf000` | `0x0100` |

the address expressions evaluate to:

* `0x8(%rdx)` = `0xf000 + 0x8` = `0xf008`
* `(%rdx, %rcx)` = `0xf000 + 0x100` = `0xf100`
* `(%rdx, %rcx, 4)` = `0xf000 + 4*0x100` = `0xf400`
* `0x80(,%rdx, 2)` = `2*0xf000 + 0x80` = `0x1e080`

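The general form `D(Rb, Ri, S)` is just an arithmetic expression, so it can be checked with a few lines of C. This is a minimal sketch: `effective_address` is a hypothetical helper, and an omitted base or index register is modelled by passing 0.

```c
#include <stdint.h>

/* Hypothetical helper: the address named by D(Rb, Ri, S).
   Pass 0 for an omitted base or index register value. */
uint64_t effective_address(int64_t d, uint64_t rb, uint64_t ri, unsigned s) {
    return rb + (uint64_t)s * ri + (uint64_t)d;
}
```

With `%rdx = 0xf000` and `%rcx = 0x100`, calling `effective_address(0x80, 0, 0xf000, 2)` models `0x80(,%rdx,2)`.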
#### Arithmetic & Logical Operations

* `leaq src, dest`
  * computes an address without referencing memory, like `p = &x[i]`
  * computes arithmetic expressions of the form `x + k * y` (`k` = 1, 2, 4, or 8)

* `addq src, dest`
* `subq src, dest`
* `imulq src, dest`
* `salq src, dest` (left shift, same as `shlq`)
* `sarq src, dest` (arithmetic right shift)
* `shrq src, dest` (logical right shift)
* `xorq src, dest`
* `andq src, dest`
* `orq src, dest`

All of the two-operand instructions from `addq` to `orq` compute `dest = dest # src`, where `#` is the instruction's operator.

* `incq dest` (`dest = dest + 1`)
* `decq dest` (`dest = dest - 1`)
* `negq dest` (`dest = -dest`)
* `notq dest` (`dest = ~dest`)

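As an illustration of `leaq` used for arithmetic, a multiplication by 12 can be split into an `x + 2*x` step (one `leaq`) followed by a shift by 2 (one `salq`). The sketch below mirrors those two steps in C; the function name is hypothetical, and the instruction sequence in the comment is what a compiler would typically emit, not a guarantee.

```c
/* Typical instruction sequence for x * 12:
       leaq (%rdi,%rdi,2), %rax   # t = x + 2*x
       salq $2, %rax              # t = t << 2
*/
long mul12(long x) {
    long t = x + 2 * x;   /* leaq: base + index*scale, no memory access */
    t = t * 4;            /* salq $2, written as a multiply so it stays
                             well-defined in C for negative x */
    return t;
}
```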
## Control

**Processor State (x86-64, Partial)**

* Temporary data (`%rax`, ...)
* Location of runtime stack (`%rsp`)
* Location of current code control point (`%rip`, the instruction pointer)
* Status of recent tests (`CF`, `ZF`, `SF`, `OF`)

### Condition Codes

* Single-bit registers
  * `CF` Carry flag (for unsigned)
  * `SF` Sign flag (for signed)
  * `ZF` Zero flag
  * `OF` Overflow flag (for signed)

**Condition Codes (Implicit Setting)**

Condition codes are set implicitly, as a side effect, by arithmetic operations (`addq`, `subq`, `mulq`).
For example, `addq` computes `t = a + b`:

* `CF` set if there is a carry out of the most significant bit (unsigned overflow)
* `ZF` set if `t == 0`
* `SF` set if `t < 0` (as signed)
* `OF` set if two's-complement (signed) overflow occurs:
  `(a > 0 && b > 0 && t < 0) || (a < 0 && b < 0 && t >= 0)`

The codes are not set by `leaq`: it is designed for **address calculation** rather than arithmetic, so it leaves the condition codes unchanged.

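The four rules for `addq` can be written out in C. This is a sketch under stated assumptions: `struct flags` and `flags_after_add` are hypothetical names, and the operands are passed as unsigned 64-bit values so that the wrap-around addition is well-defined.

```c
#include <stdint.h>

/* Hypothetical record of the four condition codes. */
struct flags { int cf, zf, sf, of; };

/* Flags as addq would set them for t = a + b on 64-bit operands. */
struct flags flags_after_add(uint64_t a, uint64_t b) {
    uint64_t t = a + b;                        /* wraps modulo 2^64 */
    struct flags f;
    f.cf = t < a;                              /* carry out: unsigned overflow */
    f.zf = t == 0;
    f.sf = (int)(t >> 63);                     /* sign bit of the result */
    /* signed overflow: operands share a sign, result has the other sign */
    f.of = (int)((~(a ^ b) & (a ^ t)) >> 63);
    return f;
}
```

Adding 1 to `INT64_MAX` sets `OF` and `SF` but not `CF`, while adding 1 to `UINT64_MAX` (that is, -1) sets `CF` and `ZF` but not `OF`.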
**Condition Codes (Explicit Setting)**

The codes are set explicitly by the compare instruction.

`cmpq b, a` computes `a - b` without storing the result in a destination.

* `CF` set if the subtraction borrows out of the most significant bit (unsigned `a < b`)
* `ZF` set if `a == b`, i.e. `a - b == 0`
* `SF` set if `(a - b) < 0` (as signed)
* `OF` set if two's-complement (signed) overflow occurs, which requires operands of opposite sign:
  `(a >= 0 && b < 0 && (a - b) < 0) || (a < 0 && b >= 0 && (a - b) >= 0)`

They are also set explicitly by the test instruction.

`testq b, a` computes `a & b` without storing the result in a destination.

It sets the condition codes based on the value of `a & b`, so it is useful to make one of the operands a mask.

* `ZF` set when `a & b == 0`
* `SF` set when `a & b < 0`

**Reading Condition Codes**

`setX`: set a single byte to 0 or 1 based on a combination of condition codes

| setX    | effect           | desc                      |
| ------- | ---------------- | ------------------------- |
| `sete`  | `ZF`             | Equal / Zero              |
| `setne` | `~ZF`            | Not Equal / Not Zero      |
| `sets`  | `SF`             | Negative                  |
| `setns` | `~SF`            | Nonnegative               |
| `setg`  | `~(SF^OF) & ~ZF` | Greater (signed)          |
| `setge` | `~(SF^OF)`       | Greater or Equal (signed) |
| `setl`  | `SF^OF`          | Less (signed)             |
| `setle` | `(SF^OF) \| ZF`  | Less or Equal (signed)    |
| `seta`  | `~CF & ~ZF`      | Above (unsigned)          |
| `setb`  | `CF`             | Below (unsigned)          |

`setX` writes only a single byte, typically a 1-byte register (`%al`, `%bl`, ...), and does not alter the remaining bytes of the register, so `movzbl` is usually applied afterwards to clear the upper bytes.

```nasm
cmpq   %rsi, %rdi    # compare x (%rdi) with y (%rsi)
setg   %al           # set %al to 1 when x > y (signed)
movzbl %al, %eax     # zero-extend byte to long
ret
```

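How the flag combinations in the table yield signed and unsigned comparisons can be checked with a small C model of `cmpq b, a`. This is a sketch only: `struct cmp_flags`, `flags_after_cmp`, and the `cond_*` helpers are hypothetical names, and operands are passed as raw 64-bit patterns.

```c
#include <stdint.h>

struct cmp_flags { int cf, zf, sf, of; };

/* Flags as `cmpq b, a` would set them, i.e. from a - b. */
struct cmp_flags flags_after_cmp(uint64_t a, uint64_t b) {
    uint64_t t = a - b;                        /* wraps modulo 2^64 */
    struct cmp_flags f;
    f.cf = a < b;                              /* borrow: unsigned a < b */
    f.zf = t == 0;
    f.sf = (int)(t >> 63);
    /* signed overflow: operand signs differ and result sign differs from a's */
    f.of = (int)(((a ^ b) & (a ^ t)) >> 63);
    return f;
}

int cond_g(struct cmp_flags f) { return !(f.sf ^ f.of) && !f.zf; }  /* setg */
int cond_l(struct cmp_flags f) { return f.sf ^ f.of; }              /* setl */
int cond_a(struct cmp_flags f) { return !f.cf && !f.zf; }           /* seta */
int cond_b(struct cmp_flags f) { return f.cf; }                     /* setb */
```

The same bit pattern can compare differently depending on the condition used: with `a` = all ones and `b` = 1, `cond_l` holds (signed -1 < 1) and so does `cond_a` (unsigned maximum > 1).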
### Conditional Branches

#### Jumping

`jX` jumps to a different part of the code depending on the condition codes.

| jX    | condition        | desc                      |
| ----- | ---------------- | ------------------------- |
| `jmp` | 1                | Unconditional             |
| `je`  | `ZF`             | Equal / Zero              |
| `jne` | `~ZF`            | Not Equal / Not Zero      |
| `js`  | `SF`             | Negative                  |
| `jns` | `~SF`            | Nonnegative               |
| `jg`  | `~(SF^OF) & ~ZF` | Greater (signed)          |
| `jge` | `~(SF^OF)`       | Greater or Equal (signed) |
| `jl`  | `SF^OF`          | Less (signed)             |
| `jle` | `(SF^OF) \| ZF`  | Less or Equal (signed)    |
| `ja`  | `~CF & ~ZF`      | Above (unsigned)          |
| `jb`  | `CF`             | Below (unsigned)          |

Old Style Conditional Branch

```c {cmd=gcc args=[-Og -x c -fno-if-conversion -c $input_file -o 3_3.o]}
long absdiff(long x, long y) {
    long result;
    if (x > y) result = x - y;
    else result = y - x;
    return result;
}
```

```sh { cmd hide }
while ! [ -r 3_3.o ]; do sleep .1; done; objdump -d 3_3.o -Msuffix
```

**expressing with `goto`**

```c {cmd=gcc args=[-Og -x c -fno-if-conversion -c $input_file -o 3_4.o]}
long absdiff_j(long x, long y) {
    long result;
    int ntest = x <= y;
    if (ntest) goto Else;
    result = x-y;
    goto Done;
Else:
    result = y-x;
Done:
    return result;
}
```

#### Conditional Move

Because branches are very disruptive to the instruction flow through pipelines, **conditional moves** are widely used where it is safe to do so: they do not require a control transfer.

```c {cmd=gcc args=[-O3 -x c -c $input_file -o 3_5.o]}
long absdiff(long x, long y) {
    long result;
    if (x > y) result = x - y;
    else result = y - x;
    return result;
}
```

```sh {cmd hide}
while ! [ -r 3_5.o ]; do sleep .1; done; objdump -d 3_5.o -Msuffix
```

However, there are several *bad cases* for conditional moves.

* Expensive computations
  ```c
  val = Test(x) ? Hard1(x) : Hard2(x);
  ```
  Both values get computed, so conditional moves only pay off when the computations are simple.
* Risky computations
  ```c
  val = p ? *p : 0;
  ```
  Both values get computed, which may have undesirable effects (here, dereferencing a null pointer).
* Computations with side effects
  ```c
  val = x > 0 ? (x *= 7) : (x += 3);
  ```
  Each expression has a side effect, so both must not be evaluated.

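The "compute both values, then select" shape of a conditional move can be imitated in C. This is only a sketch: the function names are hypothetical, and the mask-based selection stands in for the `cmov` instruction rather than reproducing what the compiler emits.

```c
/* Branch version: only one arm is evaluated. */
long absdiff_branch(long x, long y) {
    if (x > y) return x - y;
    return y - x;
}

/* Conditional-move style: both arms are evaluated, then one is selected. */
long absdiff_select(long x, long y) {
    long then_val = x - y;          /* always computed */
    long else_val = y - x;          /* always computed */
    long mask = -(long)(x > y);     /* all ones if x > y, else zero */
    return (then_val & mask) | (else_val & ~mask);
}

/* Applying the same shape to `p ? *p : 0` would read *p before the test,
   so this one has to stay a branch. */
long deref_or_zero(const long *p) {
    return p ? *p : 0;
}
```

Both `absdiff` variants return the same result; the difference is only in whether the unused arm gets evaluated.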
### Loop

#### do-while

<table>
<tr>
<td>

```c
long pcount_do(unsigned long x) {
    long result = 0;
    do {
        result += x & 0x1;
        x >>= 1;
    } while (x);
    return result;
}
```
</td>
<td>

```c {cmd=gcc args=[-Og -x c -c $input_file -o 3_6.o]}
long pcount_goto(unsigned long x) {
    long result = 0;
loop:
    result += x & 0x1;
    x >>= 1;
    if (x) goto loop;
    return result;
}
```
</td>
</tr>
</table>

```sh {cmd hide}
while ! [ -r 3_6.o ]; do sleep .1; done; objdump -d 3_6.o -Msuffix
```

**general do-while translation**

<table>
<tr>
<td>

```c
do {
    Body
} while (Test);
```
</td>
<td>

```c
loop:
    Body
    if (Test) goto loop;
```
</td>
</tr>
</table>

#### while

**general while translation #1**

This is called the **jump-to-middle translation**; it is used with the `-O0` (or `-Og`) flag.

<table>
<tr>
<td>

```c
while (Test) {
    Body
}
```
</td>
<td>

```c
    goto test;
loop:
    Body
test:
    if (Test)
        goto loop;
done:
```
</td>
</tr>
</table>

```c {cmd=gcc args=[-Og -x c -c $input_file -o 3_7.o]}
long pcount_while(unsigned long x) {
    long result = 0;
    while (x) {
        result += x & 0x1;
        x >>= 1;
    }
    return result;
}
```
```sh {cmd hide}
echo "jmp-to-middle translation"
while ! [ -r 3_7.o ]; do sleep .1; done; objdump -d 3_7.o -Msuffix
```

**general while translation #2**

This is the while-to-do-while conversion; it is used with the `-O1` flag.

<table>
<tr>
<td>

```c
while (Test) {
    Body
}
```
</td>
<td>

```c
    if (!Test) goto done;
    do {
        Body
    } while (Test);
done:
```
</td>
<td>

```c
    if (!Test) goto done;
loop:
    Body
    if (Test) goto loop;
done:
```
</td>
</tr>
</table>

```c {cmd=gcc args=[-O1 -x c -c $input_file -o 3_8.o]}
long pcount_while(unsigned long x) {
    long result = 0;
    while (x) {
        result += x & 0x1;
        x >>= 1;
    }
    return result;
}
```
```sh {cmd hide}
echo "while to do-while conversion"
while ! [ -r 3_8.o ]; do sleep .1; done; objdump -d 3_8.o -Msuffix
```

#### for loop form

<table>
<tr>
<td>

```c
for (Init; Test; Update) {
    Body
}
```
</td>

</tr>
</table>

**for-to-while conversion**

<table>
<tr>
<td>

```c
for (Init; Test; Update) {
    Body
}
```
</td>
<td>

```c
Init;
while (Test) {
    Body
    Update;
}
```
</td>
</tr>
<tr>
<td>

```c {cmd=gcc args=[-O3 -x c -c $input_file -o 3_9.o]}
#include <stddef.h>
#define WSIZE (8 * sizeof(int))  /* number of bits in an int */

long pcount_for(unsigned long x) {
    size_t i;
    long result = 0;
    for (i = 0; i < WSIZE; i++) {
        unsigned bit = (x >> i) & 0x1;
        result += bit;
    }
    return result;
}
```
</td> <td>

```c {cmd=gcc args=[-O3 -x c -c $input_file -o 3_10.o]}
#include <stddef.h>
#define WSIZE (8 * sizeof(int))  /* number of bits in an int */
long pcount_for(unsigned long x) {
    size_t i;
    long result = 0;
    i = 0;
    while (i < WSIZE) {
        unsigned bit = (x >> i) & 0x1;
        result += bit;
        i++;
    }
    return result;
}
```
</td>
</tr>
<tr>
<td>

```sh {cmd hide}
while ! [ -r 3_9.o ]; do sleep .1; done; objdump -d 3_9.o -Msuffix
```
</td>
<td>

```sh {cmd hide}
while ! [ -r 3_10.o ]; do sleep .1; done; objdump -d 3_10.o -Msuffix
```
</td>
</tr>
</table>

When the loop is then converted to do-while form, the initial test can often be optimized away: here `i = 0` is known to satisfy `i < WSIZE`.

### Switch

#### Jump Table Structure

Switch form

<table>
<tr>
<td>

```c {cmd=gcc args=[-Og -fno-asynchronous-unwind-tables -fno-stack-protector -x c -S $input_file -o 3_11.s]}
long switch_eg (long x, long y, long z) {
    long w = 1;
    switch(x) {
    case 1:
        w = y*z;
        break;
    case 2:
        w = y/z;
        /* Fall Through */
    case 3:
        w += z;
        break;
    case 5:
    case 6:
        w -= z;
        break;
    case 7:
        w *= z;
        break;
    default:
        w = 2;
    }
    return w;
}
```
</td>
<td>

```sh {cmd hide}
while ! [ -r 3_11.s ]; do sleep .1; done; cat 3_11.s
```
</td>
</tr>
</table>

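The idea behind a jump table, an array of code addresses indexed by the switch value and guarded by a single range check, can be approximated in portable C with an array of function pointers. This is an analogy, not what the compiler generates: the handler names and `switch_eg_table` are hypothetical, and the fall-through from case 2 into case 3 is modelled by one handler calling the next.

```c
typedef long (*case_fn)(long y, long z, long w);

static long case_default(long y, long z, long w) { (void)y; (void)z; (void)w; return 2; }
static long case_1(long y, long z, long w)   { (void)w; return y * z; }
static long case_3(long y, long z, long w)   { (void)y; return w + z; }
/* case 2 sets w = y/z and then falls through into case 3 */
static long case_2(long y, long z, long w)   { (void)w; return case_3(y, z, y / z); }
static long case_5_6(long y, long z, long w) { (void)y; return w - z; }
static long case_7(long y, long z, long w)   { (void)y; return w * z; }

/* One entry per value 0..7; the holes (0 and 4) point at the default code. */
static const case_fn jump_table[8] = {
    case_default, case_1,   case_2,   case_3,
    case_default, case_5_6, case_5_6, case_7
};

long switch_eg_table(long x, long y, long z) {
    long w = 1;
    if ((unsigned long)x > 7)          /* one unsigned compare rejects x < 0 and x > 7 */
        return case_default(y, z, w);
    return jump_table[x](y, z, w);     /* indirect jump through the table */
}
```

The unsigned comparison is the notable design point: a negative `x` becomes a very large unsigned value, so one test covers both ends of the range.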
## Procedures

Mechanisms in Procedures

* **Passing control**
  * to beginning of procedure code
  * back to return point
* **Passing data**
  * procedure arguments
  * return value
* **Memory management**
  * allocate during procedure execution
  * deallocate upon return

These mechanisms are all implemented with machine instructions. The **x86-64 implementation** of a procedure uses only those mechanisms it requires.

### Stack Structure

**x86-64 Stack**

A region of memory managed with *stack discipline*. It grows toward lower addresses. `%rsp` contains the lowest stack address (the address of the top element).

`pushq src`

* fetch operand at src
* decrement `%rsp` by 8
* write operand at address given by `%rsp`

`popq dest`

* read value at address given by `%rsp`
* increment `%rsp` by 8
* store value at dest (typically a register)

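The push and pop steps can be modelled with a byte array and an offset standing in for `%rsp`. This is a toy sketch under stated assumptions: `struct machine` and its functions are hypothetical names, the "stack" is only 64 bytes, and there are no overflow or underflow checks.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toy machine: a small byte array as stack memory and an
   offset into it standing in for %rsp. */
struct machine {
    uint8_t  mem[64];
    uint64_t rsp;      /* offset of the current top element */
};

/* Empty stack: rsp sits just past the highest address. */
void machine_init(struct machine *m) { m->rsp = sizeof m->mem; }

void pushq(struct machine *m, uint64_t src) {
    m->rsp -= 8;                          /* decrement %rsp by 8 */
    memcpy(&m->mem[m->rsp], &src, 8);     /* write operand at the new top */
}

uint64_t popq(struct machine *m) {
    uint64_t val;
    memcpy(&val, &m->mem[m->rsp], 8);     /* read value at address given by %rsp */
    m->rsp += 8;                          /* increment %rsp by 8 */
    return val;
}
```

After two pushes the offset has moved down by 16, and values come back in last-in, first-out order.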
### Procedure Control Flow

```c {cmd=gcc args=[-Og -x c -c $input_file -o 3_12.o]}
long mult2(long x, long y) {
    long t = x * y;
    return t;
}

void multstore(long x, long y, long *dest) {
    long t = mult2(x, y);
    *dest = t;
}
```

```sh {cmd hide}
while ! [ -r 3_12.o ]; do sleep .1; done; objdump -d 3_12.o -Msuffix
```

Procedure call: `call label`

* push return address on stack
* jump to label

Return address:

* Address of the next instruction right after the call

Procedure return: `ret`

* pop the return address from the stack
* jump to that address

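The effect of `call` and `ret` on the stack and the instruction pointer can be sketched with a toy model. Everything here is hypothetical: the `struct cpu` layout, the function names, and the addresses used in the usage note are made up for illustration, and `rip` is taken to be the address of the instruction currently executing.

```c
#include <stdint.h>

/* Hypothetical toy state: a word-addressed stack plus an instruction pointer. */
struct cpu {
    uint64_t stack[16];
    int      sp;        /* index of the top element; 16 means empty */
    uint64_t rip;       /* address of the instruction being executed */
};

/* call label: push the address of the instruction after the call, then jump.
   call_len is the encoded length of the call instruction in bytes. */
void do_call(struct cpu *c, uint64_t label, uint64_t call_len) {
    uint64_t return_addr = c->rip + call_len;
    c->stack[--c->sp] = return_addr;
    c->rip = label;
}

/* ret: pop the return address and jump to it. */
void do_ret(struct cpu *c) {
    c->rip = c->stack[c->sp++];
}
```

With a 5-byte call located at the made-up address `0x400544` targeting `0x400550`, `do_call` leaves `0x400549` on the stack, and `do_ret` resumes there.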
### Procedure Data Flow

* registers
  * first 6 args: `%rdi`, `%rsi`, `%rdx`, `%rcx`, `%r8`, `%r9`
  * return value: `%rax`
* stack
  * any arguments beyond the sixth

For example, with the `mult2`/`multstore` code above:

```sh {cmd hide}
while ! [ -r 3_12.o ]; do sleep .1; done; objdump -d 3_12.o -Msuffix
```

* when `mult2` returns, its variable `t` is already in `%rax`
* so `multstore` stores it with `movq %rax,(%rbx)`, where `%rbx` holds `long *dest`

### Managing local data

**Stack-Based Languages**

In languages that support recursion

* Code must be "reentrant": there can be multiple simultaneous instantiations of a single procedure.
* Need some place to store the ***state*** of each instantiation: (**args**, **local variables**, **return pointer**)

To achieve this, **stack discipline** is used. The state of a given procedure is needed only for a limited time (from when it is called to when it returns), and the callee returns before the caller does.

The stack is allocated in **frames**, each holding the state of a single procedure instantiation.
When a function is called, a new stack frame is created on top of the stack. When the function returns, its frame is popped and execution returns to the caller's state.

#### Stack Frame

A stack frame consists of **return information**, **local storage (if needed)**, and **temporary space (if needed)**.

* `%rbp` frame pointer
* `%rsp` stack pointer

Space is allocated when the procedure is entered: this is the "set-up" code, and it includes the push done by `call`.
Space is deallocated on return: this is the "finish" code, and it includes the pop done by `ret`.

#### x86-64/Linux Stack Frame



* Arguments
* Local variables
* Old `%rbp`

### Register Saving Conventions

When one function calls another, the callee may overwrite temporary values that the caller keeps in registers, which would cause trouble. So there are **conventions** for saving register values.

When procedure `yoo` calls `who`: `yoo` is the `caller`, `who` is the `callee`

* Caller-saved: the caller saves temporary values in its frame before the call.
* Callee-saved: the callee saves temporary values in its frame before using the registers, and restores them before returning to the caller.

#### x86-64 Linux Register Usage

`%rbx`, `%r12`, `%r13`, `%r14`, `%r15`

* Callee-saved
* Callee must save & restore

`%rbp`

* Callee-saved
* Callee must save & restore
* May be used as frame pointer by callee
* Can mix & match

`%rsp`

* Special form of callee-saved
* Restored to original value upon exit from procedure

#### Example

* To compile without a *stack canary*, add the option `-fno-stack-protector`

```c {cmd=gcc args=[-Og -x c -fno-stack-protector -c $input_file -o 3_13.o]}
long incr(long *p, long val) {
    long x = *p;
    long y = x + val;
    *p = y;
    return x;
}
long call_incr() {
    long v1 = 15213;
    long v2 = incr(&v1, 3000);
    return v1 + v2;
}
```

```sh {cmd hide}
while ! [ -r 3_13.o ]; do sleep .1; done; objdump -d 3_13.o -Msuffix
```

### Recursive Function

```c {cmd=gcc args=[-O1 -x c -fno-stack-protector -c $input_file -o 3_14.o]}
long pcount_r(unsigned long x) {
    if (x == 0) {
        return 0;
    } else {
        return (x & 1) + pcount_r(x >> 1);
    }
}
```

```sh {cmd hide}
while ! [ -r 3_14.o ]; do sleep .1; done; objdump -d 3_14.o -Msuffix
```

Recursion needs no special handling.

* Stack frames mean that each function call has private storage.
* Register saving conventions prevent one function call from corrupting another's data, *unless the code explicitly corrupts it, e.g. through a buffer overflow*.
* Stack discipline follows the call/return pattern (LIFO).

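That each call gets private storage can be observed directly by recording the address of a local variable at every recursion depth. This is a sketch only: `pcount_probe`, `frame_addr`, and `MAX_DEPTH` are hypothetical names, and the `volatile` qualifier is there to keep the local in memory so that it has an address worth recording.

```c
#include <stdint.h>

#define MAX_DEPTH 8

/* Hypothetical probe: each invocation records the address of its own local. */
static uintptr_t frame_addr[MAX_DEPTH];

long pcount_probe(unsigned long x, int depth) {
    volatile long local = (long)(x & 1);      /* lives in this call's frame */
    if (depth < MAX_DEPTH)
        frame_addr[depth] = (uintptr_t)&local;
    if (x == 0)
        return 0;
    return local + pcount_probe(x >> 1, depth + 1);
}
```

Calling `pcount_probe(0xB, 0)` recurses five levels deep, and the recorded addresses differ from level to level because every invocation's `local` is live at the same time.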