CS 208 f21 — Arithmetic in x86-64 Assembly

1 Arithmetic Instructions

Instruction       Effect                                   Description
inc \(D\)         \(D + 1 \rightarrow D\)                  increment
dec \(D\)         \(D - 1 \rightarrow D\)                  decrement
neg \(D\)         \(-D \rightarrow D\)                     negate
not \(D\)         \({\sim}D \rightarrow D\)                complement

add \(S,\:D\)     \(D + S \rightarrow D\)                  add
sub \(S,\:D\)     \(D - S \rightarrow D\)                  subtract
imul \(S,\:D\)    \(D * S \rightarrow D\)                  multiply
xor \(S,\:D\)     \(D\,\widehat{}\,S \rightarrow D\)       exclusive-or
or \(S,\:D\)      \(D\,\vert\,S \rightarrow D\)            or
and \(S,\:D\)     \(D\,\&\,S \rightarrow D\)               and

sal \(k,\:D\)     \(D\) << \(k \rightarrow D\)             left shift
shl \(k,\:D\)     \(D\) << \(k \rightarrow D\)             left shift (same as sal)
sar \(k,\:D\)     \(D\) >> \(k \rightarrow D\)             arithmetic right shift (fills with copies of the sign bit)
shr \(k,\:D\)     \(D\) >> \(k \rightarrow D\)             logical right shift (fills with zeros)
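
The difference between sar and shr only shows up when the top bit of D is set. The following is a minimal C sketch of the two behaviors, assuming gcc or clang targeting x86-64 (right-shifting a negative signed value is implementation-defined in C, and these compilers use an arithmetic shift). The function names are illustrative and not part of the course materials.

```c
#include <stdint.h>

/* Logical right shift (shr): vacated high bits are filled with zeros.
   C guarantees this behavior for unsigned operands. */
uint64_t shift_right_logical(uint64_t d, unsigned k) {
    return d >> k;   /* k must be in the range 0..63 */
}

/* Arithmetic right shift (sar): vacated high bits copy the sign bit.
   Assumption: the compiler implements >> on negative signed values as an
   arithmetic shift, as gcc and clang do on x86-64. */
int64_t shift_right_arithmetic(int64_t d, unsigned k) {
    return d >> k;   /* k must be in the range 0..63 */
}
```

Under that assumption, shifting -16 right by 2 arithmetically gives -4, while the logical shift of the same bit pattern gives a large positive value with the top two bits cleared.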

2 Thinking in Assembly

2.1 Assembly to C

A C function with the signature long f(long *p, long i) was compiled to the following assembly code:

f:
    movq    %rsi, %rax
    addq    (%rdi), %rax
    movq    %rax, (%rdi)
    ret

Register   Use
%rdi       1st argument (p)
%rsi       2nd argument (i)

Possible C code for this function:

long f(long *p, long i) {
    long return_value = i;  // move %rsi to %rax (%rax holds the return value for a function)
    return_value += *p;     // dereference the pointer p, add that to %rax
    *p = return_value;      // move the value in %rax to the memory location p points to
    return return_value;    // return (using %rax as the return value)
}

A more concise version that has the same effect:

long f(long *p, long i) {
    *p += i;
    return *p;
}
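
One way to convince yourself that the two versions agree is to run them on the same inputs. The sketch below simply places both in one file; the names f_verbose and f_concise are made up here so that the two definitions can coexist.

```c
/* Line-by-line translation of the assembly. */
long f_verbose(long *p, long i) {
    long return_value = i;  /* movq %rsi, %rax   */
    return_value += *p;     /* addq (%rdi), %rax */
    *p = return_value;      /* movq %rax, (%rdi) */
    return return_value;    /* ret               */
}

/* Concise version: update the pointed-to value, then return it. */
long f_concise(long *p, long i) {
    *p += i;
    return *p;
}
```

Starting from a variable holding 10, calling either function with i = 5 leaves 15 in the variable and returns 15.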

2.2 leaq Instruction

  • "load effective address", but more often "lovely efficient arithmetic"
  • instead of reading from the memory location given by the source operand, copies the effective address to the destination
    • generate pointers for later memory references
    • can also do a multiply and an addition in a single instruction
      • leaq 7(%rdx, %rdx, 4), %rax will set %rax equal to 5 * %rdx + 7
  • destination must be a register
  • use the q size designation when computing addresses on a 64-bit system
    • lea works with memory addresses, which are always 8 bytes on a 64-bit system
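  • movq %rdx, %rax vs movq (%rdx), %rax vs leaq (%rdx), %rax
    • suppose %rdx holds 0x100 and memory address 0x100 holds 0xab
    • movq %rdx, %rax copies the register, so %rax holds 0x100
    • movq (%rdx), %rax reads memory at address 0x100, so %rax holds 0xab
    • leaq (%rdx), %rax computes the address without reading memory, so %rax holds 0x100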
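
A rough C analogy may help here: mov with a memory operand behaves like dereferencing a pointer, while lea behaves like taking an address or doing plain arithmetic on a value. The sketch below is only an analogy; the function names are hypothetical, and the parameter named rdx stands in for the register.

```c
#include <stdint.h>

/* movq %rdx, %rax: copy the register's contents (the address itself). */
uintptr_t mov_reg(const long *rdx) {
    return (uintptr_t)rdx;
}

/* movq (%rdx), %rax: read the 8 bytes stored at the address in %rdx. */
long mov_mem(const long *rdx) {
    return *rdx;
}

/* leaq (%rdx), %rax: compute the address only; memory is never read. */
uintptr_t lea_mem(const long *rdx) {
    return (uintptr_t)&*rdx;   /* &*p evaluates to p without a dereference */
}

/* leaq 7(%rdx,%rdx,4), %rax: address arithmetic used as ordinary math. */
long lea_arith(long rdx) {
    return rdx + rdx * 4 + 7;  /* 5 * rdx + 7 */
}
```

If a variable holding 0xab is passed by address, mov_mem returns 0xab, while mov_reg and lea_mem both return the variable's address.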

3 Review

Write down the difference between leaq and movq.[1]

Given 0xf000 in %rdx, 0x0100 in %rcx (omitting additional leading zeros), what memory addresses do the following operands access?[2]

  • 0x8(%rdx)
  • (%rdx,%rcx)
  • (%rdx,%rcx,4)
  • 0x80(,%rdx,2)

4 C to Assembly

Translate this C code to assembly.[3]

long arith(long x, long y, long z)
{
    long t1 = x + y;
    long t2 = z + t1;
    long t3 = x + 4;
    long t4 = y * 48;
    long t5 = t3 + t4;
    long rval = t2 * t5;
    return rval;
}

Register   Use
%rdi       1st argument (x)
%rsi       2nd argument (y)
%rdx       3rd argument (z)

5 Practice

  • Do CSPP practice problems 3.6 (p. 192), 3.7 (p. 193), 3.10 (p. 196), and 3.11 (p. 197)
    • To do the comparison for 3.11 part C, you can write an assembly file containing both instructions (e.g., xor_test.s), assemble it to an object file (xor_test.o), and use objdump to print out the bytes. See section 3.2.2 of CSPP for an example.

Footnotes:

1

mov copies the source to the destination, possibly computing a memory address and reading or writing a value there, while lea computes a memory address and stores that address in a register (instead of accessing memory).

2
  • 0x8(%rdx) accesses 0xf008
  • (%rdx,%rcx) accesses 0xf100
  • (%rdx,%rcx,4) accesses 0xf400
  • 0x80(,%rdx,2) accesses 0x1e080
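
These answers all come from the same rule: an operand written as displacement(base, index, scale) names the address displacement + base + index * scale, with omitted parts treated as 0 and an omitted scale treated as 1. A minimal C sketch of that rule follows; the function and parameter names are illustrative only.

```c
#include <stdint.h>

/* Address named by the operand imm(base, index, scale):
   imm + base + index * scale.
   Pass 0 for an omitted displacement, base, or index, and 1 for an
   omitted scale. */
uint64_t effective_address(uint64_t imm, uint64_t base,
                           uint64_t index, uint64_t scale) {
    return imm + base + index * scale;
}
```

For instance, 0x80(,%rdx,2) with 0xf000 in %rdx corresponds to effective_address(0x80, 0, 0xf000, 2).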
3

One possible assembly implementation of arith:

arith:
    leaq    (%rdi,%rsi), %rax    // performs x + y using leaq
    addq    %rdx, %rax           // %rax now holds x + y + z (t2)
    leaq    (%rsi,%rsi,2), %rcx  // performs y * 3 using leaq
    salq    $4, %rcx             // %rcx now holds y * 48, since left shift 4 multiplies by 2^4 (16), and y * 3 * 16 == y * 48
    leaq    4(%rdi,%rcx), %rcx   // %rcx now holds y * 48 + x + 4 (t5 = t3 + t4)
    imulq   %rcx, %rax           // %rax now holds t2 * t5
    ret                          // return using %rax as the return value
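
To check the translation, each instruction can be mirrored back into C and compared against the original function. This is a sketch for checking the arithmetic, not compiler output; arith_steps and its register-named variables are made up for illustration. The shift is written as a multiplication by 16 because left-shifting a negative signed value is undefined in C.

```c
/* The original C function from the exercise. */
long arith(long x, long y, long z) {
    long t1 = x + y;
    long t2 = z + t1;
    long t3 = x + 4;
    long t4 = y * 48;
    long t5 = t3 + t4;
    long rval = t2 * t5;
    return rval;
}

/* Mirrors the assembly one instruction at a time.
   Variables are named after the registers they stand in for. */
long arith_steps(long x, long y, long z) {
    long rax = x + y;       /* leaq (%rdi,%rsi), %rax   : x + y       */
    rax += z;               /* addq %rdx, %rax          : t2          */
    long rcx = y + y * 2;   /* leaq (%rsi,%rsi,2), %rcx : y * 3       */
    rcx *= 16;              /* salq $4, %rcx            : y * 48 (t4) */
    rcx = 4 + x + rcx;      /* leaq 4(%rdi,%rcx), %rcx  : t5          */
    rax *= rcx;             /* imulq %rcx, %rax         : t2 * t5     */
    return rax;
}
```

With x = 1, y = 2, z = 3, both functions compute t2 = 6 and t5 = 101 and return 606.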