CS 208 w20 lecture 13 outline

1 Warmup

%rsp is 0x100, stack frame holds local array of 4 ints, where is return address? (0x110)
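
The warmup arithmetic can be sketched in C. This assumes the frame holds nothing but the array (no saved registers or padding), so the return address sits directly above it; the name return_address_slot is hypothetical.

```c
#include <stdint.h>

/* Address of the return address, assuming the stack frame contains only
   a local array of n_elems elements starting at %rsp. */
uint64_t return_address_slot(uint64_t rsp, uint64_t n_elems, uint64_t elem_size) {
    /* The array occupies [rsp, rsp + n_elems * elem_size). */
    return rsp + n_elems * elem_size;
}
```

With rsp = 0x100, 4 elements, and 4-byte ints, this gives 0x100 + 16 = 0x110.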

2 Compilation

  • zoom out a little from assembly to look briefly at the whole compilation process
  • code in files p1.c, p2.c
  • compile with command: gcc -Og p1.c p2.c -o p
    • put resulting machine code in file p
    • run with command: ./p

compilation.png

2.1 Producing Machine Language

  • simple cases: arithmetic and logical operations, shifts, etc.
    • all necessary information is contained in the instruction itself
  • but consider
    • conditional jump
    • accessing static data
    • call
  • addresses and labels are a problem because we don't have the final executable yet
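
A small C sketch of the three harder cases. The function and variable names are hypothetical stand-ins, not the contents of p1.c or p2.c; the comments mark where the generated code has to refer to a location rather than carry everything in the instruction itself.

```c
/* Static data: lives at an address that is only final once the
   executable is laid out. */
static long counter = 0;

/* Call target: callers need the location of this function's code. */
long bump(long step) {
    counter += step;          /* static data access */
    return counter;
}

long bump_if_positive(long x) {
    if (x > 0) {              /* conditional jump to a labeled location */
        return bump(x);       /* call */
    }
    return counter;           /* static data access */
}
```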

linking.png

3 Array Basics

  • T A[N] means an array of N elements of type T
  • contiguously allocated region: how big in terms of N? (N * sizeof(T) bytes)
  • A can be used as a pointer (T*) to the start of the array

array-types.png

3.1 Array Access

  • int x[5] = {3, 7, 1, 9, 5};
  • indexes 0, 1, 2, 3, 4
  • addresses a, a+4, a+8, a+12, a+16, a+20 (at the end)
Expression  Type  Value
x[4]        int   5
x           int*  a
x + 1       int*  a+4
&x[2]       int*  a+8
x[5]        int   ?? (whatever's there in memory)
*(x + 1)    int   7
x + i       int*  a + 4*i
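
The well-defined rows of the table can be checked directly. A minimal sketch, assuming a helper named check_access_table (hypothetical); x[5] is deliberately never read, since reading past the end is undefined behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if every well-defined row of the table holds. */
int check_access_table(void) {
    int x[5] = {3, 7, 1, 9, 5};
    uintptr_t a = (uintptr_t)x;                    /* a: address of x[0] */

    assert(x[4] == 5);
    assert((uintptr_t)(x + 1) == a + sizeof(int));
    assert((uintptr_t)&x[2] == a + 2 * sizeof(int));
    assert(*(x + 1) == 7);

    /* x + i is scaled by sizeof(int); x + 5 (one past the end) may be
       computed but not dereferenced. */
    for (int i = 0; i <= 5; i++) {
        assert((uintptr_t)(x + i) == a + i * sizeof(int));
    }
    return 1;
}
```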

3.2 Pointer Arithmetic

  • C allows pointer arithmetic where the result is scaled according to the size of the data type referenced by the pointer
  • array subscripting is the combination of pointer arithmetic and dereference (e.g., A[i] is equivalent to *(A+i))
  • int *nums and int nums[] are nearly identical declarations
    • subtle differences include initialization, sizeof
  • an array name is an expression (not a variable) that returns the address of the array
    • it looks like a pointer to the first (0th) element
      • *ar same as ar[0], *(ar+2) same as ar[2]
  • an array name is read‐only (no assignment) because it is a label
    • cannot use ar = <anything>
int get_digit(int z[5], int digit) {
    return z[digit];
}
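get_digit:
        movslq  %esi, %rsi              # sign-extend the int index to 64 bits
        movl    (%rdi,%rsi,4), %eax     # *(z + 4 * digit); an int is 4 bytes, so movl into %eax
        ret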
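
A short sketch of the points in this section: subscripting as pointer arithmetic plus dereference, and the sizeof difference between an array and a pointer. The names get_digit_ptr and check_pointer_arithmetic are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Same behavior as get_digit, written with explicit pointer arithmetic. */
int get_digit_ptr(int *z, int digit) {
    return *(z + digit);                     /* equivalent to z[digit] */
}

int check_pointer_arithmetic(void) {
    int ar[5] = {3, 7, 1, 9, 5};
    int *p = ar;                             /* array name converts to &ar[0] */

    assert(*ar == ar[0]);
    assert(*(ar + 2) == ar[2]);
    assert(get_digit_ptr(ar, 3) == 9);

    assert(sizeof(ar) == 5 * sizeof(int));   /* size of the whole array */
    assert(sizeof(p) == sizeof(int *));      /* size of just the pointer */

    p = p + 1;                               /* fine: p is a variable */
    /* ar = ar + 1;  would not compile: an array name is not assignable */
    assert(*p == 7);
    return 1;
}
```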

3.3 Exercise

for (long i = 0; i < size; i++) {
    total += arr[i];
}
Register  Use
%rdi      arr
%rsi      size
%rdx      i
%rax      total
init:
        movl $0, %edx
        jmp  test
body:
        addl (%rdi, %rdx, 4), %eax
        addq $1, %rdx
test:
        cmpq %rsi, %rdx
        jl   body
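
One way to relate the assembly back to C is a goto version of the loop with the same jump-to-test shape. This is a sketch, assuming arr is an int* (scale factor 4) and total is an int (addl into %eax); the function name sum_goto is hypothetical, and total is zeroed here even though the assembly above assumes it was set up beforehand.

```c
/* Sums size ints starting at arr, structured like the assembly. */
int sum_goto(const int *arr, long size) {
    int total = 0;        /* %eax */
    long i = 0;           /* init: movl $0, %edx */
    goto test;            /*       jmp  test     */
body:
    total += arr[i];      /* addl (%rdi, %rdx, 4), %eax */
    i++;                  /* addq $1, %rdx              */
test:
    if (i < size)         /* cmpq %rsi, %rdx */
        goto body;        /* jl   body       */
    return total;
}
```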

4 Arrays and Functions

  • arrays declared as local variables are allocated in the current stack frame
char* foo() {
    char string[32]; ...;
    return string;
}
  • the above code is broken: string lives in foo's stack frame, which is released when foo returns, so future function calls will overwrite it (draw stack frame)
  • an array is passed to a function as a pointer, which means the size gets lost!
int foo(int arr[], unsigned int size) {
    ... arr[size - 1] ...
}
  • arr is really an int* (%rdi can only fit 8 bytes)
  • without an explicit size parameter, no way to determine the length of the array
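
A sketch of both points, with hypothetical names. fill_greeting shows one common fix for the broken foo (the caller supplies the buffer, so it outlives the call); other fixes such as heap allocation exist. param_size shows that inside a function only the pointer's size is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The caller owns buf, so the returned pointer stays valid after the call. */
char *fill_greeting(char *buf, size_t size) {
    if (size == 0) {
        return buf;
    }
    strncpy(buf, "hello", size - 1);
    buf[size - 1] = '\0';          /* always terminate */
    return buf;
}

/* arr is just a pointer here: sizeof reports the pointer's size,
   not the size of the caller's array. */
size_t param_size(int *arr) {
    return sizeof(arr);
}

/* The explicit size parameter is the only way to find the last element.
   Requires size >= 1. */
int last_element(const int *arr, size_t size) {
    return arr[size - 1];
}
```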

5 Nested Arrays

  • T A[R][C]
    • 2D array of data type T
    • R rows, C columns
    • What's the array's total size? R*C*sizeof(T)
  • single contiguous block of memory
  • stored in row-major order
    • all elements in row 0, followed by all elements in row 1, etc.
    • address of row i is A + i*(C * sizeof(T))
int sea[4][5] =
  {{ 9, 8, 1, 9, 5},
   { 9, 8, 1, 0, 5},
   { 9, 8, 1, 0, 3},
   { 9, 8, 1, 1, 5}};
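
The size and row-major claims can be checked on a copy of this array. A minimal sketch; the names grid, R, C, and check_row_major are hypothetical, and addresses are compared as integers so nothing is dereferenced out of bounds.

```c
#include <assert.h>
#include <stdint.h>

enum { R = 4, C = 5 };

static int grid[R][C] = {
    {9, 8, 1, 9, 5},
    {9, 8, 1, 0, 5},
    {9, 8, 1, 0, 3},
    {9, 8, 1, 1, 5}
};

int check_row_major(void) {
    /* Total size is R * C * sizeof(T); each row is C * sizeof(T). */
    assert(sizeof(grid) == R * C * sizeof(int));
    assert(sizeof(grid[0]) == C * sizeof(int));

    /* Row i starts i * (C * sizeof(T)) bytes past the start of the array,
       so the rows sit back to back with no gaps. */
    uintptr_t base = (uintptr_t)grid;
    for (int i = 0; i < R; i++) {
        assert((uintptr_t)grid[i] == base + i * (C * sizeof(int)));
    }
    return 1;
}
```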

https://docs.google.com/spreadsheets/d/17HGr47X1Q8EqkFmZ4Fv8Y54mO8o-wIPaabQBf_bORZ8/edit?usp=sharing

int* get_sea_zip(int index) {
    return sea[index];
}
get_sea_zip:
        leaq    (%rdi,%rdi,4), %rax  # 5 * index
        leaq    sea(,%rax,4), %rax   # sea + 20 * index
        ret
sea:
        .long   9
        .long   8
        .long   1
        .long   9
        .long   5
        ...
  • A[i][j] to access an individual element of a nested array
    • the address works out to A + i*(C*sizeof(T)) + j*sizeof(T) == A + (i*C + j)*sizeof(T)
int get_sea_digit (int index, int digit) {
    return sea[index][digit];
}
get_sea_digit:
        leaq (%rdi,%rdi,4), %rax  # 5 * index
        addq %rax, %rsi           # 5 * index + digit
        movl sea(,%rsi,4), %eax   # *(sea + 4 * (5 * index + digit))
        ret
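
The element address formula can be written out as a byte-offset computation and compared against what the compiler does for A[i][j]. A sketch with hypothetical names (element_offset, check_element_address), using a local copy of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of A[i][j] from the start of T A[R][cols]:
   (i * cols + j) * sizeof(T). */
size_t element_offset(size_t i, size_t j, size_t cols, size_t elem_size) {
    return (i * cols + j) * elem_size;
}

int check_element_address(void) {
    static int grid[4][5] = {
        {9, 8, 1, 9, 5},
        {9, 8, 1, 0, 5},
        {9, 8, 1, 0, 3},
        {9, 8, 1, 1, 5}
    };
    uintptr_t base = (uintptr_t)grid;

    /* Every element's address matches the formula. */
    for (size_t i = 0; i < 4; i++) {
        for (size_t j = 0; j < 5; j++) {
            assert((uintptr_t)&grid[i][j] ==
                   base + element_offset(i, j, 5, sizeof(int)));
        }
    }
    assert(grid[2][4] == 3);
    return 1;
}
```

For index = 2 and digit = 4 with 4-byte ints, the offset is (2*5 + 4) * 4 = 56 bytes, matching the scaled index in the movl above.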