CS 208 s22 — Network Programming

1. Client-Server Transaction
2. Global IP Internet
- 2.1. Programmer's View
3. Anatomy of a Connection
- 3.1. Sockets Interface
  - 3.1.1. Echo Server Example
4. Web Servers

ARPANET 1970:

These days: >1 billion internet hosts (https://www.statista.com/statistics/264473/number-of-internet-hosts-in-the-domain-name-system/).

1 Client-Server Transaction

Most network applications are based on the client-server model:
- A server process and one or more client processes
- Server manages some resource
- Server provides service by manipulating resource for clients
- Server activated by request from client (analogy: a vending machine does nothing until a customer places an order)

Data is sent over the network in units called packets
A packet consists of a payload (the content the client is sending to the server or vice versa) and metadata to tell the network where to send the packet
How packets are sent and received is determined by the network protocol

2 Global IP Internet

An internet is an interconnected set of networks
- Carleton campus is an example of a network, and it has connections to other networks (i.e., it is connected to the Global IP Internet)
The Global IP Internet is based on the TCP/IP protocol family
- IP (Internet Protocol)
  - Provides basic naming scheme and unreliable delivery capability of packets (datagrams) from host-to-host
- UDP (User Datagram Protocol)
  - Uses IP to provide unreliable datagram delivery from process-to-process
- TCP (Transmission Control Protocol)
  - Uses IP to provide reliable byte streams from process-to-process over connections

2.1 Programmer's View

Hosts are mapped to a set of 32-bit IP addresses¹
- 128.2.203.179
- 127.0.0.1 (always the local machine, or localhost)
The set of IP addresses is mapped to a set of identifiers called Internet domain names
- 35.227.227.189 is mapped to www.carleton.edu
By convention, each byte in a 32-bit IP address is represented by its decimal value and separated by a period
- IP address: 0x8002C2F2 = 128.2.194.242
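The dotted-decimal convention above can be checked in code. The following is a minimal sketch, assuming a POSIX system; the helper names ip_to_dotted and dotted_to_ip are made up for illustration. It relies on the standard inet_ntop and inet_pton functions, which expect the address in network (big-endian) byte order, hence the htonl/ntohl conversions.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: format a 32-bit IPv4 address (host byte order)
 * as dotted decimal. Returns out on success, NULL on failure. */
const char *ip_to_dotted(uint32_t addr, char *out, size_t outlen)
{
    struct in_addr in;
    in.s_addr = htonl(addr); /* inet_ntop expects network byte order */
    return inet_ntop(AF_INET, &in, out, (socklen_t)outlen);
}

/* Hypothetical helper: parse dotted decimal into a 32-bit address
 * (host byte order). Returns 0 on success, -1 if the string is invalid. */
int dotted_to_ip(const char *dotted, uint32_t *addr)
{
    struct in_addr in;
    if (inet_pton(AF_INET, dotted, &in) != 1)
        return -1;
    *addr = ntohl(in.s_addr);
    return 0;
}
```

With a buffer of INET_ADDRSTRLEN bytes, ip_to_dotted(0x8002C2F2, buf, sizeof(buf)) fills buf with the address from the example above.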
A process on one Internet host can communicate with a process on another Internet host over a connection

3 Anatomy of a Connection

Clients and servers communicate by sending streams of bytes over connections. Each connection is:
- Point-to-point: connects a pair of processes.
- Full-duplex: data can flow in both directions at the same time.
- Reliable: stream of bytes sent by the source is eventually received by the destination in the same order it was sent.
A socket is an endpoint of a connection
- Socket address is an IPaddress:port pair
A port is a 16-bit integer that identifies a process:
- Ephemeral port: Assigned automatically by client kernel when client makes a connection request.
- Well-known port: Associated with some service provided by a server (e.g., port 80 is associated with Web servers)
A connection is uniquely identified by the socket addresses of its endpoints (socket pair)
- (cliaddr:cliport, servaddr:servport)
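To make the IPaddress:port notation concrete, here is a small sketch, assuming IPv4 and a POSIX system. The helpers make_socket_addr and format_socket_addr are hypothetical names; struct sockaddr_in, htons, ntohs, inet_pton, and inet_ntop are the standard pieces. Note that the 16-bit port is stored in network byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build an IPv4 socket address from a dotted-decimal
 * string and a port number. Returns 0 on success, -1 on a bad address. */
int make_socket_addr(const char *ip, uint16_t port, struct sockaddr_in *sa)
{
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port); /* port is stored in network byte order */
    return inet_pton(AF_INET, ip, &sa->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: format a socket address as "address:port".
 * Returns 0 on success, -1 on failure or if out is too small. */
int format_socket_addr(const struct sockaddr_in *sa, char *out, size_t outlen)
{
    char ip[INET_ADDRSTRLEN];
    int n;

    if (inet_ntop(AF_INET, &sa->sin_addr, ip, sizeof(ip)) == NULL)
        return -1;
    n = snprintf(out, outlen, "%s:%u", ip, (unsigned)ntohs(sa->sin_port));
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Formatting the client and server addresses of a connection this way gives the two halves of its socket pair.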

A client can use the port to indicate what service (process) it is sending a request to.

3.1 Sockets Interface

Set of system-level functions used in conjunction with Unix I/O to build network applications.
Created in the early 80's as part of the original Berkeley distribution of Unix that contained an early version of the Internet protocols.
Available on all modern systems
- Unix variants, Windows, OS X, iOS, Android, ARM
What is a socket?
- To the kernel, a socket is an endpoint of communication
- To an application, a socket is a file descriptor that lets the application read/write from/to the network
- Remember: All Unix I/O devices, including networks, are modeled as files
Clients and servers communicate with each other by reading from and writing to socket descriptors

The main distinction between regular file I/O and socket I/O is how the application "opens" the socket descriptors
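As a rough illustration of how that "opening" step differs from open() on a file, the sketch below shows one plausible way to obtain a connected client descriptor and a listening server descriptor with the standard getaddrinfo, socket, connect, bind, and listen calls. This is not the csapp.h implementation: the names open_clientfd_sketch and open_listenfd_sketch are hypothetical, the backlog of 16 is an arbitrary assumption, and error reporting is reduced to returning -1.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: connect to host:port.
 * Returns a connected socket descriptor, or -1 on failure. */
int open_clientfd_sketch(const char *host, const char *port)
{
    struct addrinfo hints, *list, *p;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM; /* TCP: reliable byte stream */
    hints.ai_flags = AI_NUMERICSERV; /* port is a decimal string */
    if (getaddrinfo(host, port, &hints, &list) != 0)
        return -1;

    /* Try each candidate address until one connects. */
    for (p = list; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0)
            break; /* connected */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(list);
    return fd;
}

/* Hypothetical helper: create a socket listening on the given port.
 * Returns a listening socket descriptor, or -1 on failure. */
int open_listenfd_sketch(const char *port)
{
    struct addrinfo hints, *list, *p;
    int fd = -1, optval = 1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE | AI_NUMERICSERV; /* any local address */
    if (getaddrinfo(NULL, port, &hints, &list) != 0)
        return -1;

    /* Try each candidate address until one can be bound and listened on. */
    for (p = list; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        /* Allow quick restarts of the server on the same port. */
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval));
        if (bind(fd, p->ai_addr, p->ai_addrlen) == 0 && listen(fd, 16) == 0)
            break; /* ready to accept connections */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(list);
    return fd;
}
```

Once either function returns a descriptor, the application reads and writes it with the same Unix I/O calls it would use on a file.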

3.1.1 Echo Server Example

Consists of an echo server and client

Server
- Accepts connection request
- Repeats back lines as they are typed
Client
- Requests connection to server
- Repeatedly:
  - Read line from terminal
  - Send to server
  - Read reply from server
  - Print line to terminal

For a more detailed version²

Echo client:

#include "csapp.h"

int main(int argc, char **argv)
{
    int clientfd;
    char *host, *port, buf[MAXLINE];
    rio_t rio;

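    // make sure both command-line arguments were supplied before using them
    if (argc != 3) {
        fprintf(stderr, "usage: %s <host> <port>\n", argv[0]);
        exit(1);
    }
    host = argv[1];
    port = argv[2];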

    clientfd = Open_clientfd(host, port); // open a connection to host:port
    Rio_readinitb(&rio, clientfd); // initialize rio struct to be ready for buffered reading from clientfd

    while (Fgets(buf, MAXLINE, stdin) != NULL) { // read in up to MAXLINE characters from stdin
        Rio_writen(clientfd, buf, strlen(buf)); // write that string to the socket connected to the server
        Rio_readlineb(&rio, buf, MAXLINE); // read a line from the socket
        Fputs(buf, stdout); // print what we read to stdout
    }
    Close(clientfd); 
    exit(0);
}

Echo server, iterative main routine:

#include "csapp.h"
void echo(int connfd);

int main(int argc, char **argv)
{
    int listenfd, connfd;
    socklen_t clientlen;
    struct sockaddr_storage clientaddr; /* Large enough to accommodate all supported protocol-specific address structures */
    char client_hostname[MAXLINE], client_port[MAXLINE];

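    // make sure the port argument was supplied before using it
    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(1);
    }

    // create a listening socket on the port passed in as a command-line argument
    listenfd = Open_listenfd(argv[1]);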
    // loop forever, accepting connections from clients
    while (1) {
        clientlen = sizeof(struct sockaddr_storage); /* Need to tell Accept how big the clientaddr struct is */
        connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);

        // use getnameinfo library function to convert from clientaddr struct
        // to hostname and port strings
        Getnameinfo((SA *) &clientaddr, clientlen, 
                    client_hostname, MAXLINE, client_port, MAXLINE, 0);
        printf("Connected to (%s, %s)\n", client_hostname, client_port);

        // call echo function, passing file descriptor for the socket connected to client
        echo(connfd);
        // close the connection
        Close(connfd);
    }
    exit(0);
}

Echo server, echo function:

void echo(int connfd)
{
    size_t n;
    char buf[MAXLINE];
    rio_t rio;

    Rio_readinitb(&rio, connfd); // initialize the rio struct for buffered reading from connfd
    while((n = Rio_readlineb(&rio, buf, MAXLINE)) != 0) { // read one line of up to MAXLINE characters from client
        printf("server received %d bytes\n", (int)n);
        Rio_writen(connfd, buf, n); // write the input read from client back to the client
    }
}

4 Web Servers

Clients and servers communicate using the HyperText Transfer Protocol (HTTP)
- Client and server establish TCP connection
- Client requests content
- Server responds with requested content
- Client and server close connection (eventually)

Web servers return content to clients
- content: a sequence of bytes with an associated MIME (Multipurpose Internet Mail Extensions) type
- Content is identified by its URL (Uniform Resource Locator)

Example MIME types:

MIME type	meaning
text/html	HTML document
text/plain	Unformatted text
image/gif	Binary image encoded in GIF format
image/png	Binary image encoded in PNG format
image/jpeg	Binary image encoded in JPEG format
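A server needs some way to pick the MIME type for the content it returns. One simple approach, sketched here as an assumption rather than a rule, is to go by the filename extension. The function name mime_type_for and the text/plain fallback are choices made for this example.

```c
#include <string.h>

/* Hypothetical helper: guess a MIME type from a filename's extension.
 * Falls back to text/plain when the extension is missing or unknown. */
const char *mime_type_for(const char *filename)
{
    const char *dot = strrchr(filename, '.'); /* last '.' in the name */

    if (dot == NULL)
        return "text/plain";
    if (strcmp(dot, ".html") == 0)
        return "text/html";
    if (strcmp(dot, ".gif") == 0)
        return "image/gif";
    if (strcmp(dot, ".png") == 0)
        return "image/png";
    if (strcmp(dot, ".jpg") == 0 || strcmp(dot, ".jpeg") == 0)
        return "image/jpeg";
    return "text/plain";
}
```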

4.1 URLs

Unique name for a file: URL (Uniform Resource Locator)
Example URL: http://www.carleton.edu:80/index.html
Clients use prefix (http://www.carleton.edu:80) to infer:
- What kind (protocol) of server to contact (HTTP)
- Where the server is (www.carleton.edu)
- What port it is listening on (80)
Servers use suffix (/index.html) to:
- Determine if request is for static or dynamic content.
  - No hard and fast rules for this
  - One convention: executables reside in cgi-bin directory
- Find file on file system
  - Initial "/" in suffix denotes home directory for requested content.
  - Minimal suffix is "/", which server expands to configured default filename (usually, index.html)
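The prefix/suffix split described above can be sketched as a small parser. This is an illustrative sketch only: it handles plain http:// URLs, the struct url_parts type and the function names are hypothetical, and defaulting to port 80 and path "/" are assumptions consistent with the conventions in these notes. The is_dynamic check applies the cgi-bin convention, which is only one possible rule.

```c
#include <string.h>

/* Hypothetical container for the pieces of an http:// URL. */
struct url_parts {
    char host[256];
    char port[8];
    char path[1024];
};

/* Hypothetical helper: split "http://host[:port][/path]" into its parts.
 * Returns 0 on success, -1 if the URL is not http:// or a part is too long. */
int parse_http_url(const char *url, struct url_parts *out)
{
    const char *prefix = "http://";
    size_t plen = strlen(prefix);
    const char *hostport, *path;
    size_t hp_len, host_len;

    if (strncmp(url, prefix, plen) != 0)
        return -1;
    hostport = url + plen;
    hp_len = strcspn(hostport, "/");    /* length of host[:port] */
    host_len = strcspn(hostport, ":/"); /* length of host alone */
    if (host_len == 0 || host_len >= sizeof(out->host))
        return -1;
    memcpy(out->host, hostport, host_len);
    out->host[host_len] = '\0';

    if (hostport[host_len] == ':') {    /* explicit port given */
        size_t port_len = hp_len - host_len - 1;
        if (port_len == 0 || port_len >= sizeof(out->port))
            return -1;
        memcpy(out->port, hostport + host_len + 1, port_len);
        out->port[port_len] = '\0';
    } else {
        strcpy(out->port, "80");        /* assumed default for HTTP */
    }

    path = hostport + hp_len;
    if (*path == '\0')
        path = "/";                     /* minimal suffix */
    if (strlen(path) >= sizeof(out->path))
        return -1;
    strcpy(out->path, path);
    return 0;
}

/* One convention: a path containing cgi-bin names dynamic content. */
int is_dynamic(const char *path)
{
    return strstr(path, "cgi-bin") != NULL;
}
```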

4.2 HTTP Requests

HTTP request is a request line, followed by zero or more request headers
Request line: <method> <uri> <version>
- <method> is one of GET, POST, OPTIONS, HEAD, PUT, DELETE, or TRACE
- <uri> is the content being requested
  - A URL is a type of URI (Uniform Resource Identifier)
- <version> is HTTP version of request (HTTP/1.0 or HTTP/1.1)
Request headers: <header name>: <header data>
- Provide additional information to the server
A blank line ("\r\n") indicates the end of the request
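Putting the pieces together, a minimal GET request can be assembled as a single string. The sketch below assumes HTTP/1.0 and a single Host header; the function name build_get_request is hypothetical. Each line ends in "\r\n", and the final empty line marks the end of the request.

```c
#include <stdio.h>

/* Hypothetical helper: write a minimal HTTP/1.0 GET request into buf.
 * Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t buflen,
                      const char *host, const char *path)
{
    int n = snprintf(buf, buflen,
                     "GET %s HTTP/1.0\r\n" /* request line */
                     "Host: %s\r\n"        /* one request header */
                     "\r\n",               /* blank line ends the request */
                     path, host);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

The resulting bytes could then be written to a connected socket descriptor, for example with Rio_writen as in the echo client.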

4.3 HTTP Responses

HTTP response is a response line followed by zero or more response headers, possibly followed by content, with blank line ("\r\n") separating headers from content.
Response line: <version> <status code> <status msg>
- <version> is HTTP version of the response
- <status code> is numeric status
- <status msg> is corresponding English text
  
  code	msg	description
  200	OK	Request was handled without error
  301	Moved	Provide alternate URL
  404	Not found	Server couldn’t find the file
Response headers: <header name>: <header data>
- Provide additional information about response
- Content-Type: MIME type of content in response body
- Content-Length: Length of content in response body
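On the client side, the response line can be pulled apart with sscanf. This is a sketch under simplifying assumptions (the version fits in 15 characters and the status message in 63); the function name parse_status_line is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse "<version> <status code> <status msg>".
 * Returns 0 on success, -1 if the line does not have all three fields. */
int parse_status_line(const char *line, char version[16],
                      int *code, char msg[64])
{
    /* %63[^\r\n] reads the message up to, but not including, the line end */
    int n = sscanf(line, "%15s %d %63[^\r\n]", version, code, msg);
    return n == 3 ? 0 : -1;
}
```

For the line "HTTP/1.0 200 OK\r\n" this stores "HTTP/1.0" in version, 200 in code, and "OK" in msg.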

4.4 Proxies

A proxy is an intermediary between a client and an origin server
- To the client, the proxy acts like a server
- To the server, the proxy acts like a client

Can perform useful functions as requests and responses pass by
- Examples: Caching, logging, anonymization, filtering, transcoding

Footnotes:

¹ The original Internet Protocol, with its 32-bit addresses, is known as Internet Protocol Version 4 (IPv4)
1996: Internet Engineering Task Force (IETF) introduced Internet Protocol Version 6 (IPv6) with 128-bit addresses
- Intended as the successor to IPv4
Majority of Internet traffic still carried by IPv4 (https://www.google.com/intl/en/ipv6/statistics.html)
We will focus on IPv4, but will show you how to write networking code that is protocol-independent.

² open_clientfd and open_listenfd use the collection of sockets system calls to set up a client or a listening socket. We don't have time to go through this interface in detail. The specific calls are shown in the diagram below, and the textbook has more information.