Lab 1: System calls

Aaron Bauer

January 2, 2022

Important Deadlines

Lab due 1/17/2022 (Monday) at 9:00pm.

Introduction

This lab adds system calls to osv for interacting with the file system. Your task is to implement the system calls listed below. Unimplemented system calls panic when they are called; as you implement each system call, remove its panic. For this lab you do not need to worry about synchronization, because there will only be one process.

Logistics

If you are working with a partner, you’ll want to both work off the same git repository. An easy way to set this up is to create a private GitHub repo and point your local repo at it. This can be a convenient way to back up your work even if you’re working alone (and a convenient way to submit on Gradescope).

A git repository can have one or more remotes. These are other copies of the repository that you can pull changes from and push changes to. Because you initially cloned the osv repo from mantis in Lab 0, the mantis repo is named the origin remote. You can see this by running (from the osv-w22 directory)

$ git remote -v
origin  ssh://awb@mantis.mathcs.carleton.edu/web-pages/www.cs.carleton.edu/faculty/awb/cs332/osv-w22 (fetch)
origin  ssh://awb@mantis.mathcs.carleton.edu/web-pages/www.cs.carleton.edu/faculty/awb/cs332/osv-w22 (push)

By default, the git pull and git push commands interact with the origin remote. Since you will want origin to refer to your GitHub repo, the first step is to rename the current origin remote to upstream:

$ git remote rename origin upstream

Now you will get any updates I make to the original repo by running (do this now)

$ git pull upstream master

At this point, you’re ready to put your repo on GitHub.

Go to GitHub.com and sign in (create an account if you don’t have one—it’s free)
On the left-hand side, you should see a green button to create a new repository
Give it a name, set it to Private, and click Create repository
At the top, under Quick setup, toggle to SSH (GitHub has deprecated HTTPS for some tasks)
Then run the three commands under …or push an existing repository from the command line.
1. git remote add origin ... adds the GitHub repo as a new remote named origin for your local repository (copy-paste the line with your actual GitHub URL)
2. git branch -M main renames the current branch (which is master by default) to GitHub’s default name of main
3. git push -u origin main sends the code from the local repo to the GitHub repo (i.e., pushes the contents of the main branch to the origin remote)
  - For this step, you’ll need to have set up an SSH key on GitHub
Add your partner (if applicable) as a collaborator on the GitHub repo by going to Settings>Manage Access on the GitHub repo page
Your partner should then clone the GitHub repo (via SSH, for which they will also need an SSH key on GitHub)

For collaborating on code, VS Code Live Share is pretty useful.

Background

osv has to support a list of system calls. Here is a list of system calls that are already implemented.

Process system calls:

int spawn(const char *args)
- creates a new process
int getpid()
- returns pid of process

File system system calls:

int link(const char *oldpath, const char *newpath)
- create a hard link for a file
int unlink(const char *pathname)
- remove a hard link.
int mkdir(const char *pathname)
- create a directory
int chdir(const char *path)
- change the current working directory
int rmdir(const char *pathname)
- remove a directory

Utility system calls:

void meminfo()
- print information about the current process’s address space
void info(struct sys_info *info)
- report system info

You can also refer to the osv overview for a general description of the components of the kernel. You won’t need any of this information for this lab, but it may be a useful reference going forward.

Trap

osv uses software interrupts to implement system calls. When a user application needs to invoke a system call, it issues an interrupt with the instruction int 0x40. System call numbers are defined in include/lib/syscall-num.h. Before issuing the int instruction, the user program is responsible for setting the register %rax to the chosen system call number.

The software interrupt is captured by the registered trap vector, and the handler in arch/x86_64/kernel/vectors.S runs. The handler reaches the trap function in arch/x86_64/kernel/trap.c, which routes the interrupt to the syscall function implemented in kernel/syscall.c. syscall then routes the call to the respective handler in kernel/syscall.c.
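The final routing step, from a system call number to its handler, can be pictured as a table lookup. The following is a minimal user-space sketch, not osv's actual code: the demo_ names, the two system call numbers, the return value 42, and the sysret_t typedef are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t sysret_t;                     /* stand-in for osv's return type (assumption) */
typedef sysret_t (*demo_handler_t)(void *arg); /* every handler takes one opaque argument */

/* Hypothetical system call numbers; the real ones live in include/lib/syscall-num.h. */
enum { DEMO_SYS_GETPID, DEMO_SYS_CLOSE, DEMO_SYS_COUNT };
#define DEMO_ERR_INVAL (-1)                    /* hypothetical error value */

/* Toy handlers that ignore their argument. */
sysret_t demo_sys_getpid(void *arg) { (void)arg; return 42; }
sysret_t demo_sys_close(void *arg) { (void)arg; return 0; }

/* The table is indexed by system call number. */
static const demo_handler_t demo_table[] = {
    [DEMO_SYS_GETPID] = demo_sys_getpid,
    [DEMO_SYS_CLOSE] = demo_sys_close,
};
static_assert(sizeof(demo_table) / sizeof(demo_table[0]) == DEMO_SYS_COUNT,
              "table must have one slot per system call number");

/* Route a number (as it would arrive in %rax) to its handler. */
sysret_t demo_syscall(uint64_t num, void *arg) {
    if (num >= DEMO_SYS_COUNT || demo_table[num] == NULL) {
        return DEMO_ERR_INVAL;                 /* unknown number: fail instead of jumping */
    }
    return demo_table[num](arg);
}
```

Calling demo_syscall(DEMO_SYS_GETPID, NULL) runs the toy getpid handler, while an out-of-range number is rejected before any function pointer is used.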

File, file descriptor and inode

The kernel needs to keep track of the open files so it can read, write, and eventually close them. A file descriptor is an integer that represents an open file. Somewhere in the kernel you will need to keep track of these open files. Remember that file descriptors must be reusable between processes. File descriptor 4 in one process should be able to refer to a different open file than file descriptor 4 in another (although they could reference the same open file).

Traditionally the file descriptor is an index into an array of open files.

The console is simply a file (file descriptor) from the user application’s point of view. Reading from the keyboard and writing to the screen are done through the kernel file system call interface. Currently, reading from and writing to the console are implemented with hard-coded numbers, but as you implement file descriptors, you should use the stdin and stdout file structs as backing files for the console’s reserved file descriptors (0 and 1).
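One way to picture the traditional layout is a per-process array of pointers to open files, with slots 0 and 1 filled in when the process is set up. This is only a sketch under assumptions: struct demo_file, struct demo_proc, DEMO_MAX_FILES, and the two console stand-ins are hypothetical names, not osv's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FILES 16                /* hypothetical per-process limit */

/* Stand-in for the kernel's open-file struct. */
struct demo_file {
    const char *name;
    size_t pos;                          /* current position in the file */
};

/* Stand-ins for the console's stdin and stdout file structs. */
struct demo_file demo_stdin = { "stdin", 0 };
struct demo_file demo_stdout = { "stdout", 0 };

/* Each process owns its own table, so fd 4 can mean different files in different processes. */
struct demo_proc {
    struct demo_file *fds[DEMO_MAX_FILES];   /* NULL marks a free slot */
};

/* Start with an empty table, then reserve 0 and 1 for the console. */
void demo_proc_init(struct demo_proc *p) {
    assert(p != NULL);
    for (int i = 0; i < DEMO_MAX_FILES; i++) {
        p->fds[i] = NULL;
    }
    p->fds[0] = &demo_stdin;
    p->fds[1] = &demo_stdout;
}

/* Translate an fd into its open file; NULL means "not a valid open descriptor". */
struct demo_file *demo_fd_lookup(struct demo_proc *p, int fd) {
    assert(p != NULL);
    if (fd < 0 || fd >= DEMO_MAX_FILES) {
        return NULL;
    }
    return p->fds[fd];
}
```

With a lookup helper like this, every system call that takes an fd can validate it in one place before touching the file.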

Implementation

I have provided you with a lab 1 design document (raw markdown) to guide you through your implementation. This will also serve as an example when you write your own design docs for future labs.

Hints:

File descriptors are just integers.
Look at already implemented system calls to see how to parse the arguments. (kernel/syscall.c:sys_read)
If a new file descriptor is allocated, it must be saved in the process’s file descriptor table. Similarly, if a file descriptor is released, this must be reflected in the file descriptor table.
A full file descriptor table is a user error (return an error value instead of calling panic).
A complete file system is already implemented. You can use fs_read_file/fs_write_file to read/write from a file. You can use fs_open_file to open a file. If you decide to have multiple file descriptors referring to a single file struct, make sure to call fs_reopen_file() on the file each time. You can find information about a file in the file struct and the inode struct inside of the file struct.
For this lab, the reference solution makes changes to kernel/syscall.c, kernel/proc.c and include/kernel/proc.h.
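The hints about allocating and releasing descriptors can be sketched as two small helpers. This is an illustration under assumptions rather than the reference solution: the demo_ names, the table size, and the error values are hypothetical, and which osv error code a full table should map to depends on the specification of the system call in question.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FILES 4                 /* hypothetical limit, kept small for the example */
#define DEMO_ERR_INVAL (-1)              /* hypothetical: fd is not a valid open descriptor */
#define DEMO_ERR_FULL  (-2)              /* hypothetical: no free slot in the table */

struct demo_file { const char *name; };  /* stand-in for the kernel's file struct */

struct demo_proc {
    struct demo_file *fds[DEMO_MAX_FILES];   /* NULL marks a free slot */
};

/* Save the file in the lowest free slot and return that slot's index. */
int demo_fd_alloc(struct demo_proc *p, struct demo_file *f) {
    assert(p != NULL && f != NULL);
    for (int fd = 0; fd < DEMO_MAX_FILES; fd++) {
        if (p->fds[fd] == NULL) {
            p->fds[fd] = f;
            return fd;
        }
    }
    return DEMO_ERR_FULL;                /* a full table is a user error, not a panic */
}

/* Clear the slot so the number can be handed out again. */
int demo_fd_release(struct demo_proc *p, int fd) {
    assert(p != NULL);
    if (fd < 0 || fd >= DEMO_MAX_FILES || p->fds[fd] == NULL) {
        return DEMO_ERR_INVAL;
    }
    p->fds[fd] = NULL;
    return 0;
}
```

Scanning from index 0 means a released descriptor is the first to be reused, which matches the lowest-numbered rule in the open and dup specifications below.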

What To Implement

File Descriptor Opening

/*
 * Corresponds to int open(const char *pathname, int flags, int mode); 
 * 
 * pathname: path to the file
 * flags: access mode of the file
 * mode: file permission mode if flags contains FS_CREAT
 * 
 * Open the file specified by pathname. Argument flags must include exactly one
 * of the following access modes:
 *   FS_RDONLY - Read-only mode
 *   FS_WRONLY - Write-only mode
 *   FS_RDWR - Read-write mode
 * flags can additionally include FS_CREAT. If FS_CREAT is included, a new file
 * is created with the specified permission (mode) if it does not exist yet.
 * 
 * Each open file maintains a current position, initially zero.
 *
 * Return:
 * Non-negative file descriptor on success.
 * The file descriptor returned by a successful call will be the lowest-numbered
 * file descriptor not currently open for the process.
 * 
 * ERR_FAULT - Address of pathname is invalid.
 * ERR_INVAL - flags has invalid value.
 * ERR_NOTEXIST - File specified by pathname does not exist, and FS_CREAT is not
 *                specified in flags.
 * ERR_NOTEXIST - A directory component in pathname does not exist.
 * ERR_NORES - Failed to allocate inode in directory (FS_CREAT is specified)
 * ERR_FTYPE - A component used as a directory in pathname is not a directory.
 * ERR_NOMEM - Failed to allocate memory.
 */
sysret_t
sys_open(void *arg);
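The "exactly one access mode" rule can be checked with a mask. The sketch below assumes a POSIX-like encoding in which the three access modes share the low two bits; the DEMO_ values are hypothetical and may not match osv's FS_ constants, and the provided file system may already perform an equivalent check for you.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; check osv's headers for the real FS_* constants. */
#define DEMO_RDONLY  0x000
#define DEMO_WRONLY  0x001
#define DEMO_RDWR    0x002
#define DEMO_CREAT   0x100
#define DEMO_ACCMODE 0x003               /* mask covering the access-mode bits */

static_assert((DEMO_RDONLY | DEMO_WRONLY | DEMO_RDWR) == DEMO_ACCMODE,
              "access modes must fit inside the mask");

/* Return true when flags holds one access mode plus, optionally, DEMO_CREAT. */
bool demo_open_flags_valid(int flags) {
    int access = flags & DEMO_ACCMODE;
    int extra = flags & ~(DEMO_ACCMODE | DEMO_CREAT);
    if (extra != 0) {
        return false;                    /* unknown bits are set */
    }
    return access == DEMO_RDONLY || access == DEMO_WRONLY || access == DEMO_RDWR;
}
```

Under this encoding, DEMO_WRONLY | DEMO_RDWR yields the access value 3, which matches none of the three modes and is therefore rejected.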

File Descriptor Reading

/*
 * Corresponds to ssize_t read(int fd, void *buf, size_t count);
 * 
 * fd: file descriptor of a file
 * buf: buffer to write read bytes to
 * count: number of bytes to read
 * 
 * Read from a file descriptor. Reads up to count bytes from the current position of the file descriptor 
 * fd and places those bytes into buf. The current position of the file descriptor is updated by number of bytes read.
 * 
 * If there are insufficient available bytes to complete the request,
 * reads as many as possible before returning with that number of bytes. 
 * Fewer than count bytes can be read in various conditions:
 * If the current position + count is beyond the end of the file.
 * If this is a pipe or console device and fewer than count bytes are available 
 * If this is a pipe and the other end of the pipe has been closed.
 *
 * Return:
 * On success, the number of bytes read (non-negative). The file position is
 * advanced by this number.
 * ERR_FAULT - Address of buf is invalid.
 * ERR_INVAL - fd isn't a valid open file descriptor.
 */
sysret_t
sys_read(void *arg);
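The short-read behavior described in the comment above can be illustrated with an in-memory file. This is a sketch of the semantics only: struct demo_file and demo_read are hypothetical, and in osv the copy itself would go through the file system (for example fs_read_file) rather than memcpy. The sketch returns size_t for brevity and omits the error cases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for an open file backed by a byte array. */
struct demo_file {
    const char *data;
    size_t size;                         /* total bytes in the file */
    size_t pos;                          /* current position */
};

/* Copy up to count bytes from the current position and return how many were copied. */
size_t demo_read(struct demo_file *f, void *buf, size_t count) {
    assert(f != NULL && buf != NULL);
    size_t remaining = f->pos < f->size ? f->size - f->pos : 0;
    size_t n = count < remaining ? count : remaining;
    if (n == 0) {
        return 0;                        /* at or past end of file */
    }
    memcpy(buf, f->data + f->pos, n);
    f->pos += n;                         /* the position advances by the bytes actually read */
    return n;
}
```

Reading 3 bytes and then 8 bytes from a 5-byte file returns 3 and then 2, and any further read returns 0.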

Close a File

/*
 * Corresponds to int close(int fd);
 * 
 * fd: file descriptor of a file
 * 
 * Close the given file descriptor.
 *
 * Return:
 * ERR_OK - File successfully closed.
 * ERR_INVAL - fd isn't a valid open file descriptor.
 */
sysret_t
sys_close(void *arg);

File Descriptor Writing

/*
 * Corresponds to ssize_t write(int fd, const void *buf, size_t count);
 * 
 * fd: file descriptor of a file
 * buf: buffer of bytes to write to the given fd
 * count: number of bytes to write
 * 
 * Write to a file descriptor. Writes up to count bytes from buf to the current position of 
 * the file descriptor. The current position of the file descriptor is updated by that number of bytes.
 * 
 * If the full write cannot be completed, writes as many as possible before returning with 
 * that number of bytes. For example, if the disk runs out of space.
 *
 * Return:
 * On success, the number of bytes (non-negative) written. The file position is
 * advanced by this number.
 * ERR_FAULT - Address of buf is invalid.
 * ERR_INVAL - fd isn't a valid open file descriptor.
 * ERR_END - fd refers to a pipe with no open read end.
 */
sysret_t
sys_write(void *arg);

Reading a Directory

/*
 * Corresponds to int readdir(int fd, struct dirent *dirent);
 * 
 * fd: file descriptor of a directory
 * dirent: struct dirent pointer
 * 
 * Populate the struct dirent pointer with the next entry in a directory. 
 * The current position of the file descriptor is updated to the next entry.
 * Only fds corresponding to directories are valid for readdir.
 *
 * Return:
 * ERR_OK - A directory entry is successfully read into dirent.
 * ERR_FAULT - Address of dirent is invalid.
 * ERR_INVAL - fd isn't a valid open file descriptor.
 * ERR_FTYPE - fd does not point to a directory.
 * ERR_NOMEM - Failed to allocate memory.
 * ERR_END - End of the directory is reached.
 */
sysret_t
sys_readdir(void *arg);

Duplicate a File Descriptor

/*
 * Corresponds to int dup(int fd);
 * 
 * fd: file descriptor of a file
 * 
 * Duplicate the file descriptor fd. The new descriptor must be the smallest unused file descriptor.
 * Reading/writing from a dupped fd should advance the file position of the original fd
 * and vice versa. 
 *
 * Return:
 * Non-negative file descriptor on success
 * ERR_INVAL if fd is invalid
 * ERR_NOMEM if no available new file descriptor
 */
sysret_t
sys_dup(void *arg);
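The shared-position requirement follows naturally if both descriptors point at the same file struct. Below is a minimal sketch assuming a table of file pointers; the demo_ names and error values are hypothetical, and the refcount field stands in for whatever bookkeeping fs_reopen_file performs in osv.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FILES 8                 /* hypothetical per-process limit */
#define DEMO_ERR_INVAL (-1)              /* hypothetical error values */
#define DEMO_ERR_NOMEM (-2)

struct demo_file {
    size_t pos;                          /* shared by every fd that points here */
    int refcount;                        /* how many table slots reference this file */
};

struct demo_proc {
    struct demo_file *fds[DEMO_MAX_FILES];   /* NULL marks a free slot */
};

/* Point the smallest free slot at the same file struct as fd. */
int demo_dup(struct demo_proc *p, int fd) {
    assert(p != NULL);
    if (fd < 0 || fd >= DEMO_MAX_FILES || p->fds[fd] == NULL) {
        return DEMO_ERR_INVAL;
    }
    for (int newfd = 0; newfd < DEMO_MAX_FILES; newfd++) {
        if (p->fds[newfd] == NULL) {
            p->fds[newfd] = p->fds[fd];
            p->fds[newfd]->refcount++;   /* one more reference to the same open file */
            return newfd;
        }
    }
    return DEMO_ERR_NOMEM;               /* no free descriptor */
}
```

Because no second file struct is created, advancing the position through either descriptor is visible through the other.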

File Stat

/*
 * Corresponds to int fstat(int fd, struct stat *stat);
 * 
 * fd: file descriptor of a file
 * stat: struct stat pointer
 *
 * Populate the struct stat pointer passed in to the function.
 * Console (stdin, stdout) and all console dupped fds are not valid fds for fstat. 
 * Only real files in the file system are valid for fstat.
 *
 * Return:
 * ERR_OK - File status is written in stat.
 * ERR_FAULT - Address of stat is invalid.
 * ERR_INVAL - fd isn't a valid open file descriptor or refers to non file. 
 */
sysret_t
sys_fstat(void *arg);

Testing

After you implement each of the system calls described above, you can go through the user/lab1/* files and run individual tests in the osv shell program by typing close-test or open-bad-args and so on. To run all tests in lab1, run python3 test.py 1 in the osv directory (from your normal shell, not osv). The script relies on python >=3.6. For each test passed, you should see a passed <testname> message. At the end of the run, it will display a score for the test run.

What to turn in

You will submit your work for this project via Gradescope.

Gradescope lets you submit via GitHub, which is probably the easiest method. All you’ll need to do is connect your GitHub account (the Gradescope submission page has a button for this) and select the repository and branch you wish to submit. Alternatively, you can create a zip file of the osv-w22 directory and upload that. The arch, include and kernel directories from your submission will be used.

When you submit, the autograder will compile your code and run the test cases.

Although you are allowed to submit your answers as many times as you like, you should not treat Gradescope as your only debugging tool. Many people may submit their projects near the deadline, so Gradescope will take longer to process requests. You may not get feedback in a timely manner to help you debug problems.

Grading

This lab will be graded out of 90 points, as shown in the table below. Comments explaining your approach can help earn partial credit if there are tests that don’t pass. Poor coding style can lose points, so make sure to submit clean, well-organized code.

Test	Points
`close-test`	10
`dup-console`	7
`dup-read`	7
`fd-limit`	5
`fstat-test`	7
`open-bad-args`	10
`open-twice`	10
`read-bad-args`	10
`read-small`	10
`readdir-test`	7
`write-bad-args`	7