CS 208 w20 lecture 15 outline

1 Review

#include <stdlib.h>
struct s {
    char c;
    int a[4];
    char *p;
};

int get_elem(struct s *sp, long index) {
    return sp->a[index];
}

int main() {
    struct s *sp = malloc(sizeof(struct s));
    return get_elem(sp, 3);
}

Which instruction (plus ret) implements get_elem?

2 Background

2.1 Stack Frame Review

  • In x86-64 Linux
    • stack segment of memory starts at 0x00007fffffffffff and grows down
    • code segment of memory starts at 0x400000 and grows up

stack-frame.png

2.2 What is a Buffer?

  • array used to temporarily store data
  • example: video buffering, where video data is written to a buffer before it is played
  • often used to store user input

2.3 What is a Buffer Overflow?

  • arrays can be stored on the stack alongside procedure data like the return address
  • C does not prevent writing to elements beyond the end of an array
  • together, these two facts allow for a buffer overflow where program state on the stack is corrupted
    • for example, overwriting the return address pushed on the stack by the caller would cause the program to jump to an unexpected or invalid place

no-overflow.png

overflow.png

  • attacker just has to choose the right inputs to overwrite interesting data
  • simple attack: overwrite the current return address (sometimes called stack smashing)
  • for a long time this was the #1 technical cause of security vulnerabilities
    • #1 overall cause is pretty much always humans (social engineering, ignorance, etc.)

2.3.1 Example

/* Get string from stdin */
char* gets(char* dest) {
    int c = getchar();
    char* p = dest;
    while (c != EOF && c != '\n') {
        *p++ = c;
        c = getchar();
    }
    *p = '\0';
    return dest;
}
  • what could go wrong here?

2.3.2 gets Known to be Harmful

  • bugs section of gets man page
  • also a problem with strcpy, and with scanf, fscanf, and sscanf when given a %s conversion
  • gcc: warning: the `gets' function is dangerous and should not be used.

3 Buffer Overflow In Action

Consider this (very insecure) code:

/* Echo Line */
void echo() {
    char buf[8];  /* Way too small! */
    gets(buf);
    puts(buf);
}

void call_echo() {
    echo();
}
  • entering 01234567890123456789012 works fine
  • entering 012345678901234567890123 causes an illegal instruction error
  • entering 0123456789012345678901234 causes a segmentation fault

From buf-nsp.d:

0000000000400566 <echo>:
  400566:       48 83 ec 18             sub    $0x18,%rsp
  40056a:       48 89 e7                mov    %rsp,%rdi
  40056d:       b8 00 00 00 00          mov    $0x0,%eax
  400572:       e8 d9 fe ff ff          callq  400450 <gets@plt>
  400577:       48 89 e7                mov    %rsp,%rdi
  40057a:       e8 b1 fe ff ff          callq  400430 <puts@plt>
  40057f:       48 83 c4 18             add    $0x18,%rsp
  400583:       c3                      retq

0000000000400584 <call_echo>:
  400584:       48 83 ec 08             sub    $0x8,%rsp
  400588:       b8 00 00 00 00          mov    $0x0,%eax
  40058d:       e8 d4 ff ff ff          callq  400566 <echo>
  400592:       48 83 c4 08             add    $0x8,%rsp
  400596:       c3                      retq

Spreadsheet example

4 Code Injection Attack

code-injection.png

  • very common attack to get a program to execute an arbitrary function
    • over a network, program given a string containing executable code (exploit code) with extra data to overwrite a return address with the location of the exploit
    • exploit might use a system call to start a shell giving the attacker access to the system
    • exploit might do some mischief, then repair the stack and call ret again, giving the appearance of normal behavior

4.1 Poll

vulnerable:
    subq  $0x40, %rsp
     ...
    leaq  0x10(%rsp), %rdi
    call  gets
     ...

What is the minimum number of characters that gets must read in order for us to change the return address to a stack address?

For example, change 0x00 00 00 00 00 40 05 D1 to 0x00 00 7F FF CA FE F0 0D

4.2 Real World examples

  • buffer overflow exploits are alarmingly common in real programs
    • programmers keep making the same mistakes
    • recent innovations have improved the situation

4.2.1 Internet Worm (1988)

  • protocol for getting the status of a server (fingerd) used gets to read its argument
  • worm sent exploit code that executed a root shell on the target machine
  • scanned other machines to attack, invaded about 6000 computers in hours (10% of the Internet at that time)
  • author (Robert Morris) was first person ever convicted under the Computer Fraud and Abuse Act, now faculty at MIT

4.2.2 Heartbleed (2014)

heartbleed-explanation.png

  • affected Tumblr, Google, Yahoo, Intuit (makers of TurboTax), Dropbox, Netflix, Facebook, and many, many smaller sites

4.2.3 Hacking Cars

  • in 2010, UW researchers demonstrated wirelessly hacking a car using a buffer overflow
  • overwrote the onboard control system’s code
    • disable brakes
    • unlock doors
    • turn engine on/off

4.2.4 Hacking DNA Sequencing Machines

  • in 2017, security researchers demonstrated that a buffer overflow exploit could be encoded in DNA
  • when read by vulnerable sequencing software, the attack could compromise the sequencing machine

5 Countermeasures

5.1 System Level

5.1.1 Non-Executable Stack

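x86-64 added an explicit execute permission for memory regions, so the stack can be marked non-executable and code injected onto it will not run. Not all systems have the hardware support, and this does not block all exploits.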

5.1.2 Stack Randomization

  • in the past, stack addresses were highly predictable, meaning if an attacker could determine addresses for a common web server, then many machines were vulnerable
  • make it unpredictable by allocating a random amount of space, between 0 and \(n\) bytes, on the stack at the start of the program
  • part of a larger class of techniques called address-space layout randomization (ASLR)
  • in general, this randomization can greatly increase the effort required for a successful attack, but cannot guarantee safety

5.2 Writing Better Code

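  • use fgets instead of gets
  • use strncpy instead of strcpy
  • avoid scanf with a bare %s conversion (use fgets, or %ns where n is a suitable integer)
  • or use a safer language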

5.2.1 Stack Corruption Detection

  • detect when stack corruption occurs before it can have harmful effects
  • gcc now uses stack protectors to detect buffer overflows
    • canary value (or guard value) between buffer and rest of the stack
    • generated each time the program runs, so hard for attacker to know what it will be
    • stored in a read-only segment of memory
  • prevents many common attack strategies