CS 208 w20 lecture 13 outline
1 Warmup
%rsp
is 0x100
, stack frame holds local array of 4 ints, where is return address? (0x110
)
2 Compilation
- zoom out a little from assembly to look briefly at the whole compilation process
- code in files
p1.c
,p2.c
- compile with command:
gcc -Og p1.c p2.c -o p
- put resulting machine code in file
p
- run with command: ./p
- put resulting machine code in file
2.1 Producing Machine Language
- simple cases: arithmetic and logical operations, shifts, etc.
- all necessary information is contained in the instruction itself
- but consider
- conditional jump
- accessing static data
call
- addresses and labels are a problem because we don't have the final executable yet
3 Array Basics
T A[N]
means an array ofN
elements of typeT
- contiguously allocated region—how big in terms of
N
? A
is a pointer (T*
) to the start of the array
3.1 Array Access
int x[5] = {3, 7, 1, 9, 5};
- indexes
0
,1
,2
,3
,4
- addresses
a
,a+4
,a+8
,a+12
,a+16
,a+20
(at the end)
Expression | Type | Value |
---|---|---|
x[4] |
int |
5 |
x |
int* |
a |
x + 1 |
int* |
a+4 |
&x[2] |
int* |
a+8 |
x[5] |
int |
?? (whatever's there in memory) |
*(x + 1) |
int |
7 |
x + i |
int* |
a + 4*i |
3.2 Pointer Arithmetic
- C allows pointer arithmetic where the result is scaled according to the size of the data type referenced by the pointer
- array subscripting is the combination of pointer arithmetic and dereference (e.g.,
A[i]
is equivalent to*(A+i)
) int *nums
andint nums[]
are nearly identical declarations- subtle differences include initialization,
sizeof
- subtle differences include initialization,
- an array name is an expression (not a variable) that returns the address of the array
- it looks like a pointer to the first (0th) element
*ar
same asar[0]
,*(ar+2)
same asar[2]
- it looks like a pointer to the first (0th) element
- an array name is readâonly (no assignment) because it is a label
- cannot use
ar = <anything>
- cannot use
int get_digit(int z[5], int digit) { return z[digit]; }
get_digit: movq (%rdi,%rsi,4), %rax ret
3.3 Exercise
for (long i = 0; i < size; i++) { total += arr[i]; }
Register | Use |
---|---|
%rdi |
arr |
%rsi |
size |
%rdx |
i |
%rax |
total |
init: movl $0, %edx jmp test body: addl (%rdi, %rdx, 4), %eax addq $1, %rdx test: cmpq %rsi, %rdx jl body
4 Arrays and Functions
- arrays declared as local variables are allocated in the current stack frame
char* foo() { char string[32]; ...; return string; }
- the above code is broken, future function calls will overwrite
string
(draw stack frame) - an array is passed to a function as a pointer—this means the size gets lost!
int foo(int arr[], unsigned int size) { ... arr[size - 1] ... }
arr
is really anint*
(%rdi
can only fit 8 bytes)- without an explicite
size
parameter, no way to determine the length of the array
5 Nested Arrays
T A[R][C]
- 2D array to data type
T
R
rows,C
columns- What's the array's total size?
R*C*sizeof(T)
- 2D array to data type
- single contigious block of memory
- stored in row-major order
- all elements in row 0, followed by all elements in row 1, etc.
- address of row
i
isA + i*(C * sizeof(T))
int sea[4][5] = {{ 9, 8, 1, 9, 5}, { 9, 8, 1, 0, 5}, { 9, 8, 1, 0, 3}, { 9, 8, 1, 1, 5}}
https://docs.google.com/spreadsheets/d/17HGr47X1Q8EqkFmZ4Fv8Y54mO8o-wIPaabQBf_bORZ8/edit?usp=sharing
int* get_sea_zip(int index) { return sea[index]; }
get_sea_zip: leaq (%rdi,%rdi,4), %rax # 5 * index leaq sea(,%rax,4), %rax # sea + 20 * index ret sea: .long 9 .long 8 .long 1 .long 9 .long 5 ...
A[i][j]
to access an individual element of a nested array- the address works out to
A + i*(C*sizeof(T)) + j*sizeof(T) == A + (i*C + j)*sizeof(T)
- the address works out to
int get_sea_digit (int index, int digit) { return sea[index][digit]; }
get_sea_digit: leaq (%rdi,%rdi,4), %rax # 5 * index addl %rax, %rsi # 5 * index + digit movl sea(,%rsi,4), %eax # *(sea + 4 * (5 * index + digit))