CS 208 s21 — Introduction to x86-64 Assembly

You can access a pdf of the slides here.

1 Introduction

  • We’ve covered how and where data is stored and how numbers are represented
  • Starting today: what operations does the CPU use to actually execute your program?
  • Start of digital programmable computers
    • Colossus in the 1940s (code breaking)
    • manual input of machine code
    • later that decade saw the first assembly languages, providing a text representation of machine code
  • In the decades since, computer scientists have created layers of abstraction
    • much nicer to program in C than machine code
  • why study assembly—shouldn’t we stand on the shoulders of giants?
    • understand optimizations
    • understand exactly how data is accessed
    • understand how malware works and how to defend against it

2 Assembly Programming

  • Build up assembly programming picture
    • CPU
      • PC, registers
    • Memory
      • Code, data
      • remember: memory is one huge array of bytes, each with unique address
      • no types, just contiguous bytes of binary data
    • Incomplete, we’ll add pieces in future lectures
    • Just as the details of this picture are dependent on the hardware, the assembly language operating on it will be specific to the hardware
      • instruction set architecture
  • Registers
    • Small amount of data that CPU can access extremely quickly
    • Unlike locations in memory, registers have names not addresses
      • begin with % (%rdi)
    • slide with 16 x86-64 registers, %rsp is reserved for the stack pointer
      • not necessary to memorize all these names, listed here for reference
    • memory vs registers
      • addresses vs names
      • big (8GB) vs small (16 x 8B)
      • slow (50ns) vs fast (<1ns)
      • dynamic vs static

[Figure: registers.png, the 16 x86-64 registers]
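To make the "one huge array of bytes, no types" point concrete, here is a minimal C sketch. The helper names byte_at and bits_as_float are made up for illustration, and the float result assumes the machine stores floats in IEEE 754 single precision (true on x86-64). The same 4 bytes come out as an integer or a float depending only on how the program chooses to read them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return byte i (0 = lowest address) of a 32-bit
 * value exactly as it sits in memory. */
unsigned char byte_at(uint32_t value, size_t i) {
    unsigned char raw[sizeof value];
    assert(i < sizeof value);
    memcpy(raw, &value, sizeof value);  /* copy the raw bytes, no conversion */
    return raw[i];
}

/* Hypothetical helper: read the same 4 bytes as a float instead of an
 * integer. Assumes float is 4 bytes in IEEE 754 format. */
float bits_as_float(uint32_t bits) {
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);        /* same bytes, different interpretation */
    return f;
}
```

For example, bits_as_float(0x3f800000) gives 1.0 under the IEEE 754 assumption, even though the same bytes read as an integer are 1065353216. Nothing stored in memory records which reading is "right".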

2.1 Moving Data

  • Moving data is a fundamental operation
    • mov_ src, dest
      • missing letter _ specifies size of data
        • b(yte) = 1 byte, w(ord) = 2 bytes, l(ong word) = 4 bytes, q(uad word) = 8 bytes
        • “word” is 2 bytes (16 bits) for backwards compatibility with programs written for the 8086 (the 16-bit processor that began the x86 line)
      • operand types
        • immediate (constant integer value)
          • $0x400, $-533
        • register (name of 1 of 16 registers)
          • %rax, %r13
        • memory (consecutive bytes of memory starting at given address)
          • (%rax), full form is D(reg_base, reg_index, s)
            • refers to memory address reg_base + reg_index * s + D
            • D and s are immediate values, s can only be 1, 2, 4, or 8 (why?)
            • Various components can be omitted; see book section 3.4.1
        • cannot mov memory to memory—how would you do it?
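The address arithmetic for D(reg_base, reg_index, s) and one possible answer to the memory-to-memory question can be sketched in C. This is a model for building intuition, not how the hardware is implemented: effective_address and copy_quad are hypothetical names, memory is modeled as a plain byte array, and a local variable stands in for a register such as %rax.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: address named by the operand D(reg_base, reg_index, s),
 * i.e. reg_base + reg_index * s + D. */
uint64_t effective_address(int64_t d, uint64_t base, uint64_t index, unsigned s) {
    assert(s == 1 || s == 2 || s == 4 || s == 8);  /* the only legal scale factors */
    return base + index * s + (uint64_t)d;         /* arithmetic wraps modulo 2^64 */
}

/* Hypothetical helper: copy 8 bytes from one memory location to another in
 * two steps, going through a register-like temporary. */
void copy_quad(unsigned char *memory, uint64_t src_addr, uint64_t dest_addr) {
    uint64_t reg;                                  /* stands in for %rax */
    memcpy(&reg, memory + src_addr, 8);            /* like: movq (src), %rax  */
    memcpy(memory + dest_addr, &reg, 8);           /* like: movq %rax, (dest) */
}
```

As a worked case, 8(%rax, %rcx, 4) with %rax = 0x1000 and %rcx = 3 corresponds to effective_address(8, 0x1000, 3, 4), which is 0x1000 + 3 * 4 + 8 = 0x1014. The scale factors 1, 2, 4, and 8 match the b, w, l, and q data sizes, which is convenient for indexing arrays of those element sizes.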