CS 208 w20 lecture 13 outline

1 Warmup

%rsp is 0x100, stack frame holds local array of 4 ints, where is return address? (0x110)

2 Compilation

zoom out a little from assembly to look briefly at the whole compilation process
code in files p1.c, p2.c
compile with command: gcc -Og p1.c p2.c -o p
- put resulting machine code in file p
- run with command: ./p

2.1 Producing Machine Language

simple cases: arithmetic and logical operations, shifts, etc.
- all necessary information is contained in the instruction itself
but consider
- conditional jump
- accessing static data
- call
addresses and labels are a problem because we don't have the final executable yet
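
The three problem cases can all appear in a few lines of C. The sketch below is illustrative only: the names counter, bump, and bump_until are hypothetical and are not taken from p1.c or p2.c. Compiling a file like this to an object file and disassembling it would typically show placeholder addresses for the static data and the call, which are filled in once the final executable is produced.

```c
/* Static data: its final address is not known when this file is compiled. */
long counter = 0;

/* A separate function so that bump_until contains a call
   (unless the compiler chooses to inline it). */
void bump(void) {
    counter += 1;
}

/* The loop test becomes a conditional jump to a label; the body
   contains a call and an access to static data. */
long bump_until(long limit) {
    while (counter < limit) {
        bump();
    }
    return counter;
}
```

In a multi-file program the callee could just as well live in the other source file, which is why the call target cannot be fixed while compiling a single file.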

3 Array Basics

T A[N] means an array of N elements of type T
contiguously allocated region: how big in terms of N? N * sizeof(T) bytes
A can be used as a pointer (T*) to the start of the array

3.1 Array Access

int x[5] = {3, 7, 1, 9, 5};
indexes 0, 1, 2, 3, 4
addresses a, a+4, a+8, a+12, a+16, a+20 (at the end)

Expression	Type	Value
`x[4]`	`int`	`5`
`x`	`int*`	`a`
`x + 1`	`int*`	`a+4`
`&x[2]`	`int*`	`a+8`
`x[5]`	`int`	?? (out of bounds, so undefined: whatever's there in memory)
`*(x + 1)`	`int`	`7`
`x + i`	`int*`	`a + 4*i`
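
The rows of the table can be checked with a small sketch like the one below. It assumes a 4-byte int, as the notes do; the helper names byte_offset and element_via_pointer are made up for illustration.

```c
#include <stddef.h>

int x[5] = {3, 7, 1, 9, 5};

/* Byte distance from the start of x to p: the "a+4", "a+8" column. */
ptrdiff_t byte_offset(const int *p) {
    return (const char *)p - (const char *)x;
}

/* *(x + i) names the same element as x[i]. */
int element_via_pointer(int i) {
    return *(x + i);
}
```

With 4-byte ints, byte_offset(x + 1) is 4, byte_offset(&x[2]) is 8, and element_via_pointer(1) is 7, matching the table.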

3.2 Pointer Arithmetic

C allows pointer arithmetic where the result is scaled according to the size of the data type referenced by the pointer
array subscripting is the combination of pointer arithmetic and dereference (e.g., A[i] is equivalent to *(A+i))
int *nums and int nums[] are nearly identical declarations
- subtle differences include initialization, sizeof
an array name is an expression (not a variable) that returns the address of the array
- it looks like a pointer to the first (0th) element
  - *ar same as ar[0], *(ar+2) same as ar[2]
an array name is read‐only (no assignment) because it is a label
- cannot use ar = <anything>
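
A minimal sketch of these points, using a hypothetical array ar and pointer p: sizeof tells the array and the pointer apart, subscripting and pointer arithmetic reach the same element, and only the pointer can be reassigned.

```c
#include <stddef.h>

int ar[4] = {10, 20, 30, 40};
int *p = ar;   /* p is a variable holding the address of ar[0] */

/* sizeof on the array name covers the whole array: 4 * sizeof(int). */
size_t array_bytes(void) { return sizeof(ar); }

/* sizeof on the pointer is just the size of an address. */
size_t pointer_bytes(void) { return sizeof(p); }

/* Subscripting and pointer arithmetic reach the same element. */
int same_element(int i) {
    return ar[i] == *(ar + i) && p[i] == *(p + i);
}

/* A pointer variable can be reassigned; writing `ar = ar + 1;`
   instead would be rejected by the compiler. */
int second_via_pointer(void) {
    int *q = ar;
    q = q + 1;
    return *q;
}
```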

int get_digit(int z[5], int digit) {
    return z[digit];
}

get_digit:
        movslq  %esi, %rsi              # sign-extend digit (an int) to 64 bits
        movl    (%rdi,%rsi,4), %eax     # z[digit]: load a 4-byte int
        ret

3.3 Exercise

for (long i = 0; i < size; i++) {
    total += arr[i];
}

Register	Use
`%rdi`	`arr`
`%rsi`	`size`
`%rdx`	`i`
`%rax`	`total`

init:
        movl $0, %edx
        jmp  test
body:
        addl (%rdi, %rdx, 4), %eax
        addq $1, %rdx
test:
        cmpq %rsi, %rdx
        jl   body
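
One way to read the assembly back is as C with goto, keeping the same init, body, and test labels. In this sketch the wrapper functions sum_array and sum_goto, the int return type, and the total = 0 initialization are assumptions; the snippet above shows only the loop itself.

```c
/* The loop from the exercise, wrapped in a function. */
int sum_array(int *arr, long size) {
    int total = 0;
    for (long i = 0; i < size; i++) {
        total += arr[i];
    }
    return total;
}

/* Same computation, shaped like the assembly: jump to the test first,
   then fall from the body into the test on every iteration. */
int sum_goto(int *arr, long size) {
    int total = 0;
    long i = 0;              /* init: movl $0, %edx */
    goto test;
body:
    total += arr[i];         /* addl (%rdi,%rdx,4), %eax */
    i++;                     /* addq $1, %rdx */
test:
    if (i < size) goto body; /* cmpq %rsi, %rdx ; jl body */
    return total;
}
```

Jumping to the test first means a size of 0 skips the body entirely, just as the for loop would.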

4 Arrays and Functions

arrays declared as local variables are allocated in the current stack frame

char* foo() {
    char string[32]; ...;
    return string;
}

the above code is broken: string lives in foo's stack frame, so future function calls will overwrite it after foo returns (draw stack frame)
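
One common way around this, sketched here under assumptions, is to have the caller supply the buffer so that the storage outlives the call. The name fill_greeting and the "hello" contents are hypothetical stand-ins for whatever foo was meant to produce.

```c
#include <stdio.h>

/* The caller owns buf, so the returned pointer stays valid
   for as long as the caller's buffer does. */
char *fill_greeting(char *buf, size_t size) {
    /* snprintf writes at most size bytes, including the terminator. */
    snprintf(buf, size, "%s", "hello");
    return buf;
}
```

A caller would declare char string[32]; in its own frame and pass string and sizeof(string).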
an array is passed to a function as a pointer—this means the size gets lost!

int foo(int arr[], unsigned int size) {
    ... arr[size - 1] ...
}

arr is really an int* (%rdi can only fit 8 bytes)
without an explicit size parameter, there is no way to determine the length of the array
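
A short sketch of the usual pattern: compute the element count where the array type is still visible, then pass it along with the pointer. COUNT_OF, last_element, and demo_last are hypothetical names used only for this example.

```c
#include <stddef.h>

/* Works only where `a` is a real array, not a pointer parameter. */
#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* arr arrives as a plain pointer, so size must be passed explicitly.
   The caller must guarantee size >= 1. */
int last_element(const int *arr, size_t size) {
    return arr[size - 1];
}

int demo_last(void) {
    int nums[5] = {3, 7, 1, 9, 5};
    /* Here nums is still an array, so COUNT_OF(nums) is 5. */
    return last_element(nums, COUNT_OF(nums));
}
```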

5 Nested Arrays

T A[R][C]
- 2D array of data type T
- R rows, C columns
- What's the array's total size? R*C*sizeof(T)
single contiguous block of memory
stored in row-major order
- all elements in row 0, followed by all elements in row 1, etc.
- address of row i is A + i*(C * sizeof(T))
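
The size and row-address formulas can be checked with a sketch like this one. The array grid and the helper names are hypothetical; offsets are measured in bytes, matching the formulas above.

```c
#include <stddef.h>

enum { R = 4, C = 5 };
int grid[R][C];

/* Total size: R * C * sizeof(int). */
size_t total_bytes(void) {
    return sizeof(grid);
}

/* Byte offset of row i from the start of the array: i * (C * sizeof(int)). */
ptrdiff_t row_offset(int i) {
    return (const char *)grid[i] - (const char *)grid;
}
```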

int sea[4][5] = 
  {{ 9, 8, 1, 9, 5},
   { 9, 8, 1, 0, 5},
   { 9, 8, 1, 0, 3},
   { 9, 8, 1, 1, 5}};

https://docs.google.com/spreadsheets/d/17HGr47X1Q8EqkFmZ4Fv8Y54mO8o-wIPaabQBf_bORZ8/edit?usp=sharing

int* get_sea_zip(int index) {
    return sea[index];
}

get_sea_zip:
        movslq  %edi, %rdi           # sign-extend index (an int) to 64 bits
        leaq    (%rdi,%rdi,4), %rax  # 5 * index
        leaq    sea(,%rax,4), %rax   # sea + 20 * index
        ret
sea:
        .long   9
        .long   8
        .long   1
        .long   9
        .long   5
        ...

A[i][j] to access an individual element of a nested array
- the address works out to A + i*(C*sizeof(T)) + j*sizeof(T) == A + (i*C + j)*sizeof(T)
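
As a quick check of the element-address formula, the sketch below measures the byte offset of sea[i][j] directly. The helper name element_offset is made up for this example; the array contents are the ones given above.

```c
#include <stddef.h>

int sea[4][5] =
  {{ 9, 8, 1, 9, 5},
   { 9, 8, 1, 0, 5},
   { 9, 8, 1, 0, 3},
   { 9, 8, 1, 1, 5}};

/* Byte offset of sea[i][j] from the start of sea.
   The formula predicts (i * 5 + j) * sizeof(int). */
ptrdiff_t element_offset(int i, int j) {
    return (const char *)&sea[i][j] - (const char *)sea;
}
```

For i = 1 and j = 3 the formula gives (1*5 + 3) * 4 = 32 bytes with 4-byte ints, and the element stored there is sea[1][3], which is 0.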

int get_sea_digit (int index, int digit) {
    return sea[index][digit];
}

get_sea_digit:
        movslq %esi, %rsi         # sign-extend digit (an int) to 64 bits
        movslq %edi, %rdi         # sign-extend index (an int) to 64 bits
        leaq (%rdi,%rdi,4), %rax  # 5 * index
        addq %rax, %rsi           # 5 * index + digit
        movl sea(,%rsi,4), %eax   # *(sea + 4 * (5 * index + digit))
        ret