CS 208 f21 — Network Programming
ARPANET 1970:
These days: >1 billion internet hosts (https://www.statista.com/statistics/264473/number-of-internet-hosts-in-the-domain-name-system/).
1 Client-Server Transaction
- Most network applications are based on the client-server model:
- A server process and one or more client processes
- Server manages some resource
- Server provides service by manipulating resource for clients
- Server is activated by a request from a client (analogy: placing an order at a vending machine)
- Data is sent over the network in units called packets
- A packet consists of a payload (the content the client is sending to the server or vice versa) and metadata to tell the network where to send the packet
- How packets are sent and received is determined by the network protocol
2 Global IP Internet
- An internet is an interconnected set of networks
- Carleton campus is an example of a network, and it has connections to other networks (i.e., it is connected to the Global IP Internet)
- The Global IP Internet is based on the TCP/IP protocol family
- IP (Internet Protocol)
- Provides basic naming scheme and unreliable delivery capability of packets (datagrams) from host-to-host
- UDP (User Datagram Protocol)
- Uses IP to provide unreliable datagram delivery from process-to-process
- TCP (Transmission Control Protocol)
- Uses IP to provide reliable byte streams from process-to-process over connections
2.1 Programmer's View
- Hosts are mapped to a set of 32-bit IP addresses [1]
- Example: 128.2.203.179
- 127.0.0.1 is always the local machine (localhost)
- The set of IP addresses is mapped to a set of identifiers called Internet domain names
- Example: 35.227.227.189 is mapped to www.carleton.edu
- By convention, each byte in a 32-bit IP address is represented by its decimal value and separated by a period
- IP address: 0x8002C2F2 = 128.2.194.242
- A process on one Internet host can communicate with a process on another Internet host over a connection
3 Anatomy of a Connection
- Clients and servers communicate by sending streams of bytes over connections. Each connection is:
- Point-to-point: connects a pair of processes.
- Full-duplex: data can flow in both directions at the same time.
- Reliable: stream of bytes sent by the source is eventually received by the destination in the same order it was sent.
- A socket is an endpoint of a connection
- Socket address is an IPaddress:port pair
- A port is a 16-bit integer that identifies a process:
- Ephemeral port: Assigned automatically by client kernel when client makes a connection request.
- Well-known port: Associated with some service provided by a server (e.g., port 80 is associated with Web servers)
- A connection is uniquely identified by the socket addresses of its endpoints (socket pair)
- (cliaddr:cliport, servaddr:servport)
A client uses the port to indicate which service (process) it is sending a request to.
3.1 Sockets Interface
- Set of system-level functions used in conjunction with Unix I/O to build network applications.
- Created in the early 80's as part of the original Berkeley distribution of Unix that contained an early version of the Internet protocols.
- Available on all modern systems
- Unix variants, Windows, OS X, iOS, Android, ARM
- What is a socket?
- To the kernel, a socket is an endpoint of communication
- To an application, a socket is a file descriptor that lets the application read/write from/to the network
- Remember: All Unix I/O devices, including networks, are modeled as files
- Clients and servers communicate with each other by reading from and writing to socket descriptors
The main distinction between regular file I/O and socket I/O is how the application "opens" the socket descriptors
3.1.1 Echo Server Example
Consists of an echo server and client
- Server
- Accepts connection request
- Repeats back lines as they are typed
- Client
- Requests connection to server
- Repeatedly:
- Read line from terminal
- Send to server
- Read reply from server
- Print line to terminal
For a more detailed version [2]
Echo client:
    #include "csapp.h"

    int main(int argc, char **argv) {
        int clientfd;
        char *host, *port, buf[MAXLINE];
        rio_t rio;

        // make sure both the host and the port were given on the command line
        if (argc != 3) {
            fprintf(stderr, "usage: %s <host> <port>\n", argv[0]);
            exit(1);
        }
        host = argv[1];
        port = argv[2];

        clientfd = Open_clientfd(host, port); // open a connection to host:port
        Rio_readinitb(&rio, clientfd);        // initialize rio struct to be ready for buffered reading from clientfd

        while (Fgets(buf, MAXLINE, stdin) != NULL) { // read in up to MAXLINE characters from stdin
            Rio_writen(clientfd, buf, strlen(buf));  // write that string to the socket connected to the server
            Rio_readlineb(&rio, buf, MAXLINE);       // read a line from the socket
            Fputs(buf, stdout);                      // print what we read to stdout
        }
        Close(clientfd);
        exit(0);
    }
Echo server, iterative main routine:
    #include "csapp.h"

    void echo(int connfd);

    int main(int argc, char **argv) {
        int listenfd, connfd;
        socklen_t clientlen;
        struct sockaddr_storage clientaddr; /* Large enough to accommodate all supported protocol-specific address structures */
        char client_hostname[MAXLINE], client_port[MAXLINE];

        // make sure the port was given on the command line
        if (argc != 2) {
            fprintf(stderr, "usage: %s <port>\n", argv[0]);
            exit(1);
        }

        // create a listening socket on the port passed in as a command-line argument
        listenfd = Open_listenfd(argv[1]);

        // loop forever, accepting connections from clients
        while (1) {
            clientlen = sizeof(struct sockaddr_storage); /* Need to tell Accept how big the clientaddr struct is */
            connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);

            // use getnameinfo library function to convert from clientaddr struct
            // to hostname and port strings
            Getnameinfo((SA *) &clientaddr, clientlen, client_hostname, MAXLINE,
                        client_port, MAXLINE, 0);
            printf("Connected to (%s, %s)\n", client_hostname, client_port);

            // call echo function, passing file descriptor for the socket connected to client
            echo(connfd);

            // close the connection
            Close(connfd);
        }
        exit(0);
    }
Echo server, echo function:
    void echo(int connfd) {
        size_t n;
        char buf[MAXLINE];
        rio_t rio;

        Rio_readinitb(&rio, connfd); // initialize the rio struct for buffered reading from connfd
        while ((n = Rio_readlineb(&rio, buf, MAXLINE)) != 0) { // read one line of up to MAXLINE characters from client
            printf("server received %d bytes\n", (int)n);
            Rio_writen(connfd, buf, n); // write the input read from client back to the client
        }
    }
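The echo programs rely on the `Open_clientfd` and `Open_listenfd` wrappers from `csapp.h`. As a rough idea of what such helpers do, here is a sketch built only on the standard sockets calls (`getaddrinfo`, `socket`, `connect`, `bind`, `listen`). It is not the `csapp.c` source: the names `sketch_open_clientfd` and `sketch_open_listenfd`, the backlog of 1024, and the error handling are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

/* Hypothetical client-side helper: connect to host:port.
 * Returns a connected descriptor, or -1 on failure. */
int sketch_open_clientfd(const char *host, const char *port) {
    struct addrinfo hints, *list, *p;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 addresses */
    hints.ai_socktype = SOCK_STREAM;  /* TCP connections only */
    if (getaddrinfo(host, port, &hints, &list) != 0)
        return -1;
    for (p = list; p != NULL; p = p->ai_next) { /* try each candidate address */
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0)
            break;                    /* connected */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(list);
    return fd;
}

/* Hypothetical server-side helper: listen on port on any local address.
 * Returns a listening descriptor, or -1 on failure. */
int sketch_open_listenfd(const char *port) {
    struct addrinfo hints, *list, *p;
    int fd = -1, on = 1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;      /* wildcard address, suitable for bind */
    if (getaddrinfo(NULL, port, &hints, &list) != 0)
        return -1;
    for (p = list; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        /* Allow quick restarts of the server on the same port. */
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
        if (bind(fd, p->ai_addr, p->ai_addrlen) == 0 && listen(fd, 1024) == 0)
            break;                    /* bound and listening */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(list);
    return fd;
}

int main(void) {
    struct sockaddr_storage addr;
    socklen_t len = sizeof(addr);
    char port[32];
    int listenfd, clientfd;

    listenfd = sketch_open_listenfd("0"); /* "0": let the kernel pick a free port */
    if (listenfd < 0 || getsockname(listenfd, (struct sockaddr *)&addr, &len) < 0)
        return 1;
    /* Recover the chosen port as a string so the client can connect to it. */
    if (getnameinfo((struct sockaddr *)&addr, len, NULL, 0,
                    port, sizeof(port), NI_NUMERICSERV) != 0)
        return 1;
    clientfd = sketch_open_clientfd("127.0.0.1", port);
    printf("listening on port %s, client connect %s\n",
           port, clientfd >= 0 ? "succeeded" : "failed");
    if (clientfd >= 0)
        close(clientfd);
    close(listenfd);
    return 0;
}
```

Because both helpers take strings and go through `getaddrinfo`, neither one hard-codes an IPv4 or IPv6 address structure.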
4 Web Servers
- Clients and servers communicate using the HyperText Transfer Protocol (HTTP)
- Client and server establish TCP connection
- Client requests content
- Server responds with requested content
- Client and server close connection (eventually)
- Web servers return content to clients
- content: a sequence of bytes with an associated MIME (Multipurpose Internet Mail Extensions) type
- Content is identified by its URL (Uniform Resource Locator)
Example MIME types:
MIME type | meaning |
---|---|
text/html | HTML document |
text/plain | Unformatted text |
image/gif | Binary image encoded in GIF format |
image/png | Binary image encoded in PNG format |
image/jpeg | Binary image encoded in JPEG format |
4.1 URLs
- Unique name for a file: URL (Uniform Resource Locator)
- Example URL: http://www.carleton.edu:80/index.html
- Clients use prefix (http://www.carleton.edu:80) to infer:
- What kind (protocol) of server to contact (HTTP)
- Where the server is (www.carleton.edu)
- What port it is listening on (80)
- Servers use suffix (/index.html) to:
- Determine if request is for static or dynamic content.
- No hard and fast rules for this
- One convention: executables reside in cgi-bin directory
- Find file on file system
- Initial "/" in suffix denotes home directory for requested content.
- Minimal suffix is "/", which server expands to configured default filename (usually, index.html)
4.2 HTTP Requests
- HTTP request is a request line, followed by zero or more request headers
- Request line: <method> <uri> <version>
- <method> is one of GET, POST, OPTIONS, HEAD, PUT, DELETE, or TRACE
- <uri> is the content being requested
- A URL is a type of URI (Uniform Resource Identifier)
- <version> is HTTP version of request (HTTP/1.0 or HTTP/1.1)
- Request headers: <header name>: <header data>
- Provide additional information to the server
- A blank line ("\r\n") indicates the end of the request
4.3 HTTP Responses
- HTTP response is a response line followed by zero or more response headers, possibly followed by content, with blank line ("\r\n") separating headers from content.
- Response line: <version> <status code> <status msg>
- <version> is HTTP version of the response
- <status code> is numeric status
- <status msg> is corresponding English text

code | msg | description |
---|---|---|
200 | OK | Request was handled without error |
301 | Moved | Provide alternate URL |
404 | Not found | Server couldn’t find the file |

- Response headers: <header name>: <header data>
- Provide additional information about response
- Content-Type: MIME type of content in response body
- Content-Length: Length of content in response body
4.4 Proxies
- A proxy is an intermediary between a client and an origin server
- To the client, the proxy acts like a server
- To the server, the proxy acts like a client
- Can perform useful functions as requests and responses pass by
- Examples: Caching, logging, anonymization, filtering, transcoding
Footnotes:
- [1] The original Internet Protocol, with its 32-bit addresses, is known as Internet Protocol Version 4 (IPv4)
- 1996: Internet Engineering Task Force (IETF) introduced Internet Protocol Version 6 (IPv6) with 128-bit addresses
- Intended as the successor to IPv4
- Majority of Internet traffic still carried by IPv4 (https://www.google.com/intl/en/ipv6/statistics.html)
- We will focus on IPv4, but will show you how to write networking code that is protocol-independent.
- [2] open_clientfd and open_listenfd use the collection of sockets system calls to set up a client or a listening socket. We don't have time to go through this interface in detail; the specific calls are shown in the diagram below, and the textbook has more information.