CS 208 w20 lecture 13 outline
1 Warmup
%rsp is 0x100, stack frame holds local array of 4 ints, where is return address? (0x110)
2 Compilation
- zoom out a little from assembly to look briefly at the whole compilation process
- code in files p1.c, p2.c
- compile with command: gcc -Og p1.c p2.c -o p
  - put resulting machine code in file p
- run with command: ./p
2.1 Producing Machine Language
- simple cases: arithmetic and logical operations, shifts, etc.
- all necessary information is contained in the instruction itself
- but consider
  - conditional jump
  - accessing static data
  - call
- addresses and labels are a problem because we don't have the final executable yet
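A minimal sketch of the distinction above, in C rather than assembly. The names counter, scale, and demo are hypothetical and only mark which source constructs turn into instructions that need an address or label.

```c
/* Static data: an instruction that reads counter needs to know where it lives. */
static long counter = 10;

/* Hypothetical helper: a call instruction needs the location of this code. */
long scale(long x) { return 3 * x; }

long demo(long x) {
    long y = (x << 2) + 1;   /* shift and add: everything needed is in the instruction */
    if (y > counter)         /* conditional jump: needs the location of its target */
        y = scale(y);        /* call: needs the location of scale */
    return y + counter;      /* static data access: needs the location of counter */
}
```

Compiling this with gcc -Og -S and reading the output shows the jump target, counter, and scale appearing as symbolic labels rather than numbers.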
3 Array Basics
- T A[N] means an array of N elements of type T
- contiguously allocated region: how big in terms of N?
- A acts as a pointer (T*) to the start of the array
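One way to check the size question is to ask the compiler with sizeof. This is a sketch; N = 6 and the function names are arbitrary choices for illustration.

```c
#include <stddef.h>

#define N 6   /* hypothetical element count */

/* Total bytes occupied by T A[N], shown for T = int and T = double. */
size_t int_array_bytes(void) {
    int A[N];
    return sizeof(A);            /* N * sizeof(int) */
}

size_t double_array_bytes(void) {
    double A[N];
    return sizeof(A);            /* N * sizeof(double) */
}

/* Contiguous allocation: neighboring elements are exactly sizeof(T) bytes apart. */
ptrdiff_t element_stride(void) {
    int A[N];
    return (char *)&A[1] - (char *)&A[0];
}
```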
3.1 Array Access
int x[5] = {3, 7, 1, 9, 5};
- indexes 0, 1, 2, 3, 4
- addresses a, a+4, a+8, a+12, a+16, a+20 (at the end)
| Expression | Type | Value |
|---|---|---|
| x[4] | int | 5 |
| x | int* | a |
| x + 1 | int* | a+4 |
| &x[2] | int* | a+8 |
| x[5] | int | ?? (whatever's there in memory) |
| *(x + 1) | int | 7 |
| x + i | int* | a + 4*i |
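The table entries can be checked in C by measuring each pointer's byte offset from the start of x (the table's a). This is a sketch: offset_from_a and the two accessor functions are hypothetical helpers, and x[5] is deliberately never read, since only forming the address x + 5 is well defined.

```c
static int x[5] = {3, 7, 1, 9, 5};

/* Byte offset of p from the start of x, i.e. from the address called a. */
long offset_from_a(const int *p) {
    return (long)((const char *)p - (const char *)x);
}

int last_element(void)   { return x[4]; }       /* value 5 */
int second_element(void) { return *(x + 1); }   /* value 7 */
```

With a 4-byte int, offset_from_a(x + 1) is 4 and offset_from_a(&x[2]) is 8, matching the a+4 and a+8 rows.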
3.2 Pointer Arithmetic
- C allows pointer arithmetic where the result is scaled according to the size of the data type referenced by the pointer
- array subscripting is the combination of pointer arithmetic and dereference (e.g., A[i] is equivalent to *(A+i))
- int *nums and int nums[] are nearly identical declarations
  - subtle differences include initialization, sizeof
- an array name is an expression (not a variable) that returns the address of the array
  - it looks like a pointer to the first (0th) element
  - *ar same as ar[0], *(ar+2) same as ar[2]
- an array name is read-only (no assignment) because it is a label
  - cannot use ar = <anything>
int get_digit(int z[5], int digit) { return z[digit]; }
get_digit:
    movl (%rdi,%rsi,4), %eax      # z[digit]: 4-byte load from z + 4*digit
    ret
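The scaling rule and one of the array-versus-pointer differences can be observed directly. This is a sketch with hypothetical function names, measuring distances in bytes through char* casts.

```c
#include <stddef.h>

/* p + 1 advances by sizeof(*p) bytes, whatever the pointed-to type is. */
size_t int_step(void) {
    int buf[2];
    int *p = buf;
    return (size_t)((char *)(p + 1) - (char *)p);    /* sizeof(int) */
}

size_t long_step(void) {
    long buf[2];
    long *p = buf;
    return (size_t)((char *)(p + 1) - (char *)p);    /* sizeof(long) */
}

/* Subscripting is pointer arithmetic plus a dereference. */
int subscript_matches(const int *A, size_t i) {
    return &A[i] == A + i && A[i] == *(A + i);
}

/* sizeof sees the whole array for an array object, but only the pointer for a pointer. */
size_t sizeof_array(void) {
    int nums[5] = {3, 7, 1, 9, 5};
    return sizeof(nums);                             /* 5 * sizeof(int) */
}

size_t sizeof_pointer(void) {
    int storage[5] = {0};
    int *nums = storage;
    return sizeof(nums);                             /* sizeof(int *) */
}
```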
3.3 Exercise
for (long i = 0; i < size; i++) { total += arr[i]; }
| Register | Use |
|---|---|
| %rdi | arr |
| %rsi | size |
| %rdx | i |
| %rax | total |
init:
    movl $0, %edx
    jmp test
body:
    addl (%rdi, %rdx, 4), %eax
    addq $1, %rdx
test:
    cmpq %rsi, %rdx
    jl body
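For reference, the loop can be wrapped in a complete function. This is a sketch: the name sum_array, the int return type, and the starting value total = 0 are assumptions, since the exercise shows only the loop itself.

```c
/* Hypothetical wrapper around the exercise loop. */
int sum_array(const int *arr, long size) {
    int total = 0;                        /* total lives in %eax (assumed to start at 0) */
    for (long i = 0; i < size; i++) {     /* i in %rdx, size in %rsi */
        total += arr[i];                  /* addl (%rdi,%rdx,4), %eax */
    }
    return total;
}
```

The jmp test at the top of the assembly corresponds to checking i < size before the first iteration, so a size of 0 skips the body entirely.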
4 Arrays and Functions
- arrays declared as local variables are allocated in the current stack frame
char* foo() { char string[32]; ...; return string; }
- the above code is broken: future function calls will overwrite string (draw stack frame)
- an array is passed to a function as a pointer, which means the size gets lost!
int foo(int arr[], unsigned int size) { ... arr[size - 1] ... }
- arr is really an int* (%rdi can only fit 8 bytes)
- without an explicit size parameter, no way to determine the length of the array
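One common way around both problems is to have the caller own the storage and pass its size along. This is a sketch of that pattern, not the only fix; fill_greeting, last_of, and the string "hello" are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* The caller supplies the buffer, so the returned pointer never refers to
 * a stack frame that has already been popped. */
char *fill_greeting(char *buf, size_t size) {
    snprintf(buf, size, "%s", "hello");   /* writes at most size bytes, including '\0' */
    return buf;
}

/* arr is only an int*, so the length has to arrive as its own parameter. */
int last_of(const int *arr, size_t size) {
    return arr[size - 1];
}
```

A caller would declare char s[32]; and call fill_greeting(s, sizeof s), where sizeof still works because s is a real array in the caller's frame.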
5 Nested Arrays
- T A[R][C]
  - 2D array of data type T
  - R rows, C columns
  - What's the array's total size? R*C*sizeof(T)
- single contiguous block of memory
- stored in row-major order
  - all elements in row 0, followed by all elements in row 1, etc.
- address of row i is A + i*(C * sizeof(T))
int sea[4][5] = {{ 9, 8, 1, 9, 5}, { 9, 8, 1, 0, 5}, { 9, 8, 1, 0, 3}, { 9, 8, 1, 1, 5}};
https://docs.google.com/spreadsheets/d/17HGr47X1Q8EqkFmZ4Fv8Y54mO8o-wIPaabQBf_bORZ8/edit?usp=sharing
int* get_sea_zip(int index) { return sea[index]; }
get_sea_zip:
    leaq (%rdi,%rdi,4), %rax      # 5 * index
    leaq sea(,%rax,4), %rax       # sea + 20 * index
    ret
sea:
    .long 9
    .long 8
    .long 1
    .long 9
    .long 5
    ...
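The total-size and row-address formulas can be checked against what the compiler lays out. This is a sketch; the macros R and C and the functions row_offset and total_size are names introduced here for illustration.

```c
#include <stddef.h>

#define R 4
#define C 5

static int sea[R][C] = {
    {9, 8, 1, 9, 5},
    {9, 8, 1, 0, 5},
    {9, 8, 1, 0, 3},
    {9, 8, 1, 1, 5},
};

/* Byte offset of row i from the start of sea; the formula predicts i * (C * sizeof(int)). */
size_t row_offset(size_t i) {
    return (size_t)((char *)sea[i] - (char *)sea);
}

/* Whole array in bytes; the formula predicts R * C * sizeof(int). */
size_t total_size(void) {
    return sizeof(sea);
}
```

With a 4-byte int each row is 20 bytes, which is the sea + 20 * index computed by the two leaq instructions.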
- A[i][j] to access an individual element of a nested array
  - the address works out to A + i*(C*sizeof(T)) + j*sizeof(T) == A + (i*C + j)*sizeof(T)
int get_sea_digit (int index, int digit) { return sea[index][digit]; }
get_sea_digit:
    leaq (%rdi,%rdi,4), %rax      # 5 * index
    addq %rax, %rsi               # 5 * index + digit
    movl sea(,%rsi,4), %eax       # *(sea + 4 * (5 * index + digit))
    ret
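The same address computation can be written out by hand in C and compared with ordinary subscripting. This is a sketch: get_sea_digit_flat is a hypothetical name, and the array is redeclared here (as const) so the block stands alone.

```c
#include <stddef.h>

#define R 4
#define C 5

static const int sea[R][C] = {
    {9, 8, 1, 9, 5},
    {9, 8, 1, 0, 5},
    {9, 8, 1, 0, 3},
    {9, 8, 1, 1, 5},
};

/* Computes A + (i*C + j)*sizeof(T) in bytes, then loads one int from that address. */
int get_sea_digit_flat(int i, int j) {
    const char *base = (const char *)sea;                       /* A as a byte address */
    size_t offset = ((size_t)i * C + (size_t)j) * sizeof(int);  /* (i*C + j) * sizeof(T) */
    return *(const int *)(base + offset);
}
```

Working in bytes through a char* keeps the sizeof(T) factor visible; the assembly gets the same effect from the scale factor 4 in sea(,%rsi,4).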