Linked Lists II
1 Reading
After reviewing the material below, play around with the singly-linked list and doubly-linked list visualizations at https://visualgo.net/en/list. (Stacks, Queues, and Deques coming next week!)
2 List Interface
We've done a lot of directly manipulating linked list nodes to get our heads around a structure consisting of a chain of references.
But to be a useful general-purpose data structure, our linked list is going to need better encapsulation: ideally a single object with a bunch of useful methods, like ArrayList.
Let's start out by looking at the method signatures for some of the operations we'll want our linked list to support.
// post: returns number of elements in list
public int size();

// post: returns true iff list has no elements
public boolean isEmpty();

// post: value is added to beginning of list
public void addFirst(E value);

// post: value is added to end of list
public void addLast(E value);

// pre: list is not empty
// post: returns first value in list
public E getFirst();

// pre: list is not empty
// post: returns last value in list
public E getLast();

// pre: list is not empty
// post: removes and returns first value from list
public E removeFirst();

// pre: list is not empty
// post: removes and returns last value from list
public E removeLast();

// pre: 0 <= i < size()
// post: returns object found at that location
public E get(int i);

// pre: 0 <= i <= size()
// post: inserts value o at index i, shifting any later elements back by one
public void add(int i, E o);

// pre: 0 <= i < size()
// post: removes and returns object found at that location
public E remove(int i);
You may notice many of these methods are similar (or identical) to ArrayList methods.
This is no accident!
Extensible arrays and linked lists are two approaches to providing a list-like data structure.
They have different properties, and are thus appropriate for different situations.
One important difference in the methods is the presence of methods explicitly for accessing/modifying the beginning and end (head and tail) of the list (e.g., getFirst, removeFirst, addLast).
Such operations are so commonly used on linked lists that it makes sense to provide these versions (as opposed to doing everything with plain add and remove).
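To get a feel for how these head and tail methods read in practice, here is a small sketch that uses the standard library's java.util.LinkedList, which happens to offer methods with the same names. The class ListInterfaceDemo and its build helper are made up for this illustration; the list we implement ourselves is developed in the sections that follow.

```java
import java.util.LinkedList;

// Hypothetical demo class; only java.util.LinkedList is a real library class here.
class ListInterfaceDemo {
    // Builds a small list using the head/tail methods plus an indexed add.
    static LinkedList<String> build() {
        LinkedList<String> list = new LinkedList<>();
        list.addFirst("b");   // [b]
        list.addFirst("a");   // [a, b]
        list.addLast("c");    // [a, b, c]
        list.add(1, "x");     // [a, x, b, c]
        return list;
    }

    public static void main(String[] args) {
        LinkedList<String> list = build();
        System.out.println(list.getFirst());  // a
        System.out.println(list.getLast());   // c
        System.out.println(list.remove(1));   // x
        System.out.println(list.size());      // 3
    }
}
```

The same calls could be written with plain add and remove, but the named versions make the intent (front or back of the list) obvious at the call site.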
3 Singly-Linked List
We'll begin our LinkedList implementation by deciding what fields it should have.
As we saw last time, with a reference to the first node in a chain, we can use the next fields to work our way to any subsequent node.
So our data structure will keep track of the front (often called the head) of the list:
public class LinkedList<E> {
    private ListNode<E> head;
}
Note that I'm using Java generics to allow our linked list to be declared to store different types of data.
The ListNode class is familiar to us from the last video, but we need to define it somewhere our linked list can access it.
Since a ListNode is an internal part of a LinkedList object, and not something intended to be used on its own, I'm going to make it an inner class.
Java actually lets us define classes inside of other classes for exactly this purpose of defining object types we need for a particular implementation.
public class LinkedList<E> {
    private ListNode<E> head;

    private class ListNode<E> {
        private E data;
        private ListNode<E> next;

        ListNode(E data, ListNode<E> next) {
            this.data = data;
            this.next = next;
        }
    }
}
I declare the inner ListNode class to be private because I want it to only be available inside the definition of LinkedList.
Now on to the constructor, which should initialize an empty list.
An empty list is one with no nodes in it, and Java uses the special value null to indicate when an object variable doesn't refer to any actual object.
Thus, our constructor would look like
public LinkedList() { head = null; }
This naturally suggests an implementation for the isEmpty method:
public boolean isEmpty() { return head == null; }
Before getting to methods that deal with adding and removing nodes, let's look at our size method.
Its task is to count the number of nodes in the list and return that number.
Last time we looked at how we could use a loop to traverse our chain of nodes, and we can use this same technique to count them.
public int size() {
    ListNode<E> current = head;
    int count = 0;
    while (current != null) {
        count++;
        current = current.next;
    }
    return count;
}
While this will work, from a performance perspective it's unsatisfying. Every time our list is asked for its size, it will loop through all the nodes, an operation that is linear in the size of the list (if there are \(n\) nodes, our loop will take a number of operations proportional to \(n\)). A better design would be to keep track of the number of nodes as we add and remove them, so we don't have to re-count every time.
public class LinkedList<E> {
    private ListNode<E> head;
    private int count;

    private class ListNode<E> {
        private E data;
        private ListNode<E> next;

        ListNode(E data, ListNode<E> next) {
            this.data = data;
            this.next = next;
        }
    }

    public LinkedList() {
        head = null;
        count = 0;
    }

    public boolean isEmpty() {
        return head == null;
    }

    public int size() {
        return count;
    }
}
This is an example of trading off space for time.
By adding a count field, our LinkedList object takes up slightly more memory.
In return, our size method goes from linear time to constant time, which is a big improvement.
These kinds of tradeoffs are very common in data structure design (and software design in general).
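As a rough illustration of this tradeoff, the sketch below keeps both versions of size side by side so they can be compared. CountedList, sizeByTraversal, and the int-only node are hypothetical names and simplifications chosen for this example, not part of the LinkedList class above.

```java
// Hypothetical sketch: a tiny list of ints that can report its size two ways.
class CountedList {
    private static class Node {
        int data;
        Node next;

        Node(int data, Node next) {
            this.data = data;
            this.next = next;
        }
    }

    private Node head;
    private int count; // the extra field: the "space" side of the tradeoff

    void addFirst(int value) {
        head = new Node(value, head);
        count++; // a constant amount of bookkeeping on every add
    }

    // Linear time: visits every node on each call.
    int sizeByTraversal() {
        int n = 0;
        for (Node cur = head; cur != null; cur = cur.next) {
            n++;
        }
        return n;
    }

    // Constant time: returns the stored count.
    int size() {
        return count;
    }
}
```

Both methods always agree as long as every method that adds or removes a node also updates count, which is the price of the faster lookup.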
3.1 Operations at the head
Alright, now we're ready to tackle addFirst and removeFirst.
// requires: import java.util.NoSuchElementException;

public void addFirst(E data) {
    // construct a new node to be the head of the list
    // this constructor will set the next field of the new node
    // to refer to the current head
    ListNode<E> newHead = new ListNode<>(data, head);
    // update the list
    head = newHead;
    count++;
}

public E removeFirst() {
    if (isEmpty()) {
        // if the user attempts to remove from an empty list, throw an exception
        // with an appropriate error message
        throw new NoSuchElementException("Cannot remove from an empty list");
    }
    // save the head's data in a temporary variable, so we can return it later
    E data = head.data;
    // update the list
    head = head.next;
    count--;
    // return the data from the removed node
    return data;
}
To help visualize what these methods are doing, let's return to our trusty boxes and arrows.
If we have an empty list and add two nodes via addFirst, what will it look like?
LinkedList<String> list = new LinkedList<>();
list.addFirst("hi");
list.addFirst("go");
       +-------+------+
       | count | head |
  list |   0   | null |
       +-------+------+

list.addFirst("hi")

                                newNode
       +-------+------+     +------+------+
       | count | head |     | data | next |
  list |   0   | null |     | "hi" | null |
       +-------+------+     +------+------+

       +-------+------+     +------+------+
       | count | head |     | data | next |
  list |   1   |   +--+---> | "hi" | null |
       +-------+------+     +------+------+

list.addFirst("go");

                         +-----------------------------+
                         |                             |
                         |      newNode                V
       +-------+------+  |  +------+------+     +------+------+
       | count | head |  |  | data | next |     | data | next |
  list |   1   |   +--+--+  | "go" |   +--+---> | "hi" | null |
       +-------+------+     +------+------+     +------+------+

                                newNode
       +-------+------+     +------+------+     +------+------+
       | count | head |     | data | next |     | data | next |
  list |   2   |   +--+---> | "go" |   +--+---> | "hi" | null |
       +-------+------+     +------+------+     +------+------+
How about then calling removeFirst on this list?
String val = list.removeFirst();
       +-------+------+     +------+------+     +------+------+
       | count | head |     | data | next |     | data | next |
  list |   2   |   +--+---> | "go" |   +--+---> | "hi" | null |
       +-------+------+     +------+------+     +------+------+

       +-------+------+     +------+------+     +------+------+
       | count | head |     | data | next |     | data | next |
  list |   2   |   +--+---> | "go" |   +--+---> | "hi" | null |
       +-------+------+     +------+------+     +------+------+

  data "go"

                         +-----------------------------+
                         |                             |
                         |                             V
       +-------+------+  |  +------+------+     +------+------+
       | count | head |  |  | data | next |     | data | next |
  list |   1   |   +--+--+  | "go" |   +--+---> | "hi" | null |
       +-------+------+     +------+------+     +------+------+

  data "go"
The String stored in data will be returned.
Java will automatically clean up the removed node now that it is no longer in use.
It's possible to tell just by looking at the code for addFirst and removeFirst that these are constant time operations.
The same number of steps is involved regardless of how many nodes are currently in the list.
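The head operations traced above can be collected into one small, self-contained sketch. HeadOnlyList and its static nested Node class are hypothetical stand-ins for the LinkedList and ListNode classes in these notes, renamed so the example compiles on its own.

```java
import java.util.NoSuchElementException;

// Hypothetical, self-contained sketch of a list that only tracks its head.
class HeadOnlyList<E> {
    // Static nested node class so the sketch needs nothing else.
    private static class Node<T> {
        final T data;
        final Node<T> next;

        Node(T data, Node<T> next) {
            this.data = data;
            this.next = next;
        }
    }

    private Node<E> head;
    private int count;

    boolean isEmpty() {
        return head == null;
    }

    int size() {
        return count;
    }

    // The new node's next refers to the old head; no loop is needed.
    void addFirst(E data) {
        head = new Node<>(data, head);
        count++;
    }

    // Unlinks the first node and returns its data; no loop is needed.
    E removeFirst() {
        if (isEmpty()) {
            throw new NoSuchElementException("Cannot remove from an empty list");
        }
        E data = head.data;
        head = head.next;
        count--;
        return data;
    }
}
```

Adding "hi" and then "go" with addFirst and calling removeFirst twice hands back "go" first and "hi" second, matching the diagrams: the most recently added node is always the one at the head.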
3.2 Operations at the tail
What about adding and removing from the other end of the list?
public void addLast(E data) {
    // if the list is empty, addLast is equivalent to addFirst
    // since the loop after this if-statement assumes the list is not empty,
    // we'll use addFirst which can handle an empty list
    if (isEmpty()) {
        addFirst(data);
        return;
    }
    // construct the new node
    ListNode<E> newTail = new ListNode<>(data, null);
    // find the current tail node
    ListNode<E> current = head;
    while (current.next != null) {
        current = current.next;
    }
    // update the list
    current.next = newTail;
    count++;
}

public E removeLast() {
    // if the list is empty or has only one element, removeLast is equivalent to removeFirst
    if (size() <= 1) {
        return removeFirst();
    }
    // find the node right before the current tail node
    ListNode<E> current = head;
    while (current.next.next != null) {
        current = current.next;
    }
    // save the tail's data in a temporary variable, so we can return it later
    E data = current.next.data;
    // update the list
    current.next = null;
    count--;
    // return the data from the removed node
    return data;
}
Unfortunately, manipulating the end of the list (called the tail) isn't as easy or efficient as manipulating the head.
Since we only have a reference to the head, each of these methods has to use a loop to traverse the list until it gets to the current tail node (we can distinguish the tail node because the last node in the list always has a next value of null).
Traversing the list is a linear time operation because we have to go around the loop once for each node currently in the list.
That is, if there are \(n\) nodes in our list, we have to move current to the next node \(n-1\) times in order to reach the tail node.
Ideally, we want our tail-related methods to be constant time like our head-related methods.
We'll use the same strategy we used to improve the size method from linear time to constant time: trading off space for time.
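To make the \(n-1\) figure concrete, here is a small sketch that counts how many times current advances before it reaches the tail. TailWalk, chain, and hopsToTail are hypothetical helpers written only for this illustration.

```java
// Hypothetical helper class; the names are for illustration only.
class TailWalk {
    static class Node {
        int data;
        Node next;

        Node(int data, Node next) {
            this.data = data;
            this.next = next;
        }
    }

    // Builds a chain of n nodes (assumes n >= 1) and returns its head.
    static Node chain(int n) {
        Node head = null;
        for (int i = n; i >= 1; i--) {
            head = new Node(i, head);
        }
        return head;
    }

    // Counts how many times current must advance to reach the tail.
    // Assumes head is not null.
    static int hopsToTail(Node head) {
        int hops = 0;
        Node current = head;
        while (current.next != null) {
            current = current.next;
            hops++;
        }
        return hops;
    }
}
```

For a chain of 5 nodes, hopsToTail reports 4 advances; for 1000 nodes it reports 999. The work grows in step with the length of the list, which is what linear time means here.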
4 Upgrade: Tail Reference
The slow part of the addLast and removeLast implementations above is the loop to find the tail node.
What if we could just always keep track of which node is the tail node like we do with the head node?
Then we could do away with the loop and achieve a constant time addLast and removeLast.
First, let's augment our implementation with a field tail that keeps track of the tail node:
public class LinkedList<E> {
    private ListNode<E> head;
    private ListNode<E> tail;
    private int count;

    private class ListNode<E> {
        private E data;
        private ListNode<E> next;

        ListNode(E data, ListNode<E> next) {
            this.data = data;
            this.next = next;
        }
    }

    public LinkedList() {
        head = null;
        tail = null;
        count = 0;
    }

    // ...
}
Seems simple enough, but how do we use it?
Here's our addLast now that we have a tail field:
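public void addLast(E data) {
    ListNode<E> newTail = new ListNode<>(data, null);
    // update the list
    if (isEmpty()) {
        // edge case: in an empty list tail is null, and the new node is also the head
        head = newTail;
    } else {
        tail.next = newTail;
    }
    tail = newTail;
    count++;
}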
This is definitely an upgrade from the old addLast!
It's much simpler and now a constant time operation.
What about removeLast?
Here we run into a problem.
Removing the tail node requires setting the next field of the previous node to null.
But we have no way to get a reference to that previous node other than traversing the list from the head.
That is, the tail node doesn't keep track of the previous node in the list.
In order to remove from the end of the list efficiently, we will need to modify our list design.
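The sketch below puts the tail reference into a complete, self-contained singly-linked list so the asymmetry is visible in one place: addLast and getLast need no loop, while removeLast still walks from the head. TailList and its Node class are hypothetical names, and the handling of empty and one-element lists is one reasonable way to keep head and tail consistent, not necessarily how the course implementation does it.

```java
import java.util.NoSuchElementException;

// Hypothetical sketch: singly-linked list with both head and tail references.
class TailList<E> {
    private static class Node<T> {
        final T data;
        Node<T> next;

        Node(T data, Node<T> next) {
            this.data = data;
            this.next = next;
        }
    }

    private Node<E> head;
    private Node<E> tail;
    private int count;

    int size() { return count; }

    boolean isEmpty() { return head == null; }

    void addFirst(E data) {
        head = new Node<>(data, head);
        if (tail == null) {
            tail = head; // the only node is both first and last
        }
        count++;
    }

    // Constant time: the tail reference replaces the loop.
    void addLast(E data) {
        Node<E> newTail = new Node<>(data, null);
        if (isEmpty()) {
            head = newTail;
        } else {
            tail.next = newTail;
        }
        tail = newTail;
        count++;
    }

    E removeFirst() {
        if (isEmpty()) {
            throw new NoSuchElementException("Cannot remove from an empty list");
        }
        E data = head.data;
        head = head.next;
        if (head == null) {
            tail = null; // the list just became empty
        }
        count--;
        return data;
    }

    // Constant time thanks to the tail reference.
    E getLast() {
        if (isEmpty()) {
            throw new NoSuchElementException("List is empty");
        }
        return tail.data;
    }

    // Still linear time: the node before the tail is only reachable from the head.
    E removeLast() {
        if (count <= 1) {
            return removeFirst();
        }
        Node<E> current = head;
        while (current.next != tail) {
            current = current.next;
        }
        E data = tail.data;
        current.next = null;
        tail = current;
        count--;
        return data;
    }
}
```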
5 Upgrade: Doubly-Linked List
Here we continue the tradeoff of space and time. To make removing from the tail efficient in terms of time, we will have each node maintain a reference to the previous node in the list.
private class ListNode<E> {
    private ListNode<E> prev;
    private E data;
    private ListNode<E> next;

    ListNode(E data, ListNode<E> next, ListNode<E> prev) {
        this.prev = prev;
        this.data = data;
        this.next = next;
    }
}
This will make each node take up more space, but allows us to implement an efficient removeLast method:
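public E removeLast() {
    if (size() <= 1) {
        return removeFirst();
    }
    // save the tail's data so we can return it later
    E data = tail.data;
    // the previous node becomes the new tail
    tail = tail.prev;
    tail.next = null;
    count--;
    return data;
}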
We call this form of linked list a doubly-linked list due to the fact that each node has two links: one to the next node and one to the previous node. When our nodes only have a link to the next node, we say we have a singly-linked list.
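Putting the pieces together, a doubly-linked list with all four end operations might look like the sketch below. DoublyLinkedSketch is a hypothetical, self-contained class: the way it keeps head, tail, prev, and next consistent in the empty and one-node cases is an assumption made for this example rather than the course's reference implementation.

```java
import java.util.NoSuchElementException;

// Hypothetical sketch of a doubly-linked list with constant time end operations.
class DoublyLinkedSketch<E> {
    private static class Node<T> {
        Node<T> prev;
        final T data;
        Node<T> next;

        Node(T data, Node<T> next, Node<T> prev) {
            this.prev = prev;
            this.data = data;
            this.next = next;
        }
    }

    private Node<E> head;
    private Node<E> tail;
    private int count;

    int size() { return count; }

    void addFirst(E data) {
        Node<E> node = new Node<>(data, head, null);
        if (head == null) {
            tail = node;      // empty list: the new node is also the tail
        } else {
            head.prev = node; // old head now has a predecessor
        }
        head = node;
        count++;
    }

    void addLast(E data) {
        Node<E> node = new Node<>(data, null, tail);
        if (tail == null) {
            head = node;      // empty list: the new node is also the head
        } else {
            tail.next = node;
        }
        tail = node;
        count++;
    }

    E removeFirst() {
        if (head == null) {
            throw new NoSuchElementException("Cannot remove from an empty list");
        }
        E data = head.data;
        head = head.next;
        if (head == null) {
            tail = null;      // list became empty
        } else {
            head.prev = null;
        }
        count--;
        return data;
    }

    // No loop: the prev link leads straight to the new tail.
    E removeLast() {
        if (tail == null) {
            throw new NoSuchElementException("Cannot remove from an empty list");
        }
        E data = tail.data;
        tail = tail.prev;
        if (tail == null) {
            head = null;      // list became empty
        } else {
            tail.next = null;
        }
        count--;
        return data;
    }
}
```

Each method touches a fixed number of references regardless of the list's length, so all four run in constant time.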
6 ArrayList vs Singly-Linked List vs Doubly-Linked List
Now that we have these three different structures for representing a list, how would we go about choosing which one to use?
It will depend on which list operations you need to be efficient—there's no one choice that's always best.
Need to frequently remove from the front of the list? Either type of linked list can do this in constant time, so you would use one of those.
Also removing from the end of the list? That narrows it down to a doubly-linked list.
On the other hand, if you need to frequently access elements in the middle of the list, an ArrayList will do that far more efficiently (i.e., constant time).
Here's a chart comparing these three structures across a variety of operations:
| Operation | ArrayList | Singly-Linked List | Doubly-Linked List |
|---|---|---|---|
| size | constant time | constant time | constant time |
| addLast | constant time or linear time (if resize) | linear time | constant time |
| removeLast | constant time | linear time | constant time |
| getLast | constant time | linear time | constant time |
| addFirst | linear time | constant time | constant time |
| removeFirst | linear time | constant time | constant time |
| getFirst | constant time | constant time | constant time |
| get(index) | constant time | linear time | linear time |
| set(index) | constant time | linear time | linear time |
| remove(index) | linear time | linear time | linear time |
| contains | linear time | linear time | linear time |
| remove(element) | linear time | linear time | linear time |
7 Practice Problems [1]
- Intuitively, we might think that addFirst and removeFirst only modify the head field of the list, while addLast and removeLast only modify the tail field. However, there are situations where each of these operations must modify both head and tail. (Such special cases are often called edge cases or boundary cases.) When would each of these operations need to modify both?
- The chart above shows that both the ArrayList and linked lists take linear time to remove an element from a specific position in the list (remove(index)). Why is this? Is it the same reason for both data structures or different reasons?
- When you add or remove the element found at index i of a singly-linked list, you must create a temporary current node reference and advance it through the list. At which index's node should the loop stop, relative to i?
- In an ArrayList, it is possible to overrun the capacity of the array, at which point the list must be resized to fit. Is resizing necessary on a linked list? What limits the number of elements that a linked list can have?
- Write an implementation of the public E remove(int index) method for a doubly-linked list.
- Write an implementation of the public boolean contains(E element) method for a doubly-linked list.
Footnotes:
[1] Solutions:
- When addFirst or addLast is called on an empty list, both head and tail need to be updated to refer to the new node (otherwise one would still be null). When removeFirst or removeLast is called on a list with one node, both head and tail need to be set to null (otherwise one of them would still refer to the removed node).
- For ArrayList, remove takes linear time due to shifting elements over to fill in the slot in the array used by the removed element. For a linked list, it takes linear time to find the node at a specific index.
- The loop should stop at index i - 1, the index before the one to add or remove. This is so that you can adjust the preceding node's next reference.
- Resizing is not necessary for a linked list, since more nodes can be dynamically allocated. The only limit on the number of elements is the amount of memory available to the Java virtual machine.
public E remove(int index) {
    if (index < 0 || index >= size()) {
        // generate an error if there is no element at index
        throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size());
    }
    // find the node at index
    ListNode<E> current = head;
    int currentIndex = 0;
    while (currentIndex != index) {
        current = current.next;
        currentIndex++;
    }
    // remove the node, accounting for edge cases where node is at the beginning or end of the list
    if (current.prev != null) {
        // update previous node's next to skip over this node
        current.prev.next = current.next;
    } else {
        // the removed node was the first node, so head moves forward
        head = current.next;
    }
    if (current.next != null) {
        // update next node's prev to skip over this node
        current.next.prev = current.prev;
    } else {
        // the removed node was the last node, so tail moves back
        tail = current.prev;
    }
    count--;
    return current.data;
}
public boolean contains(E element) {
    // traverse the list, comparing element with the data at each node
    ListNode<E> current = head;
    // head will be null for an empty list, so we don't need a special case
    while (current != null) {
        if (current.data.equals(element)) {
            return true;
        }
        current = current.next;
    }
    // if we exit the loop, we've compared to each node and none were equal
    // so element is not in the list
    return false;
}
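The third solution (stop at index i - 1) can be sketched for a singly-linked list as follows. IndexedList is a hypothetical, head-only class written for this illustration; index 0 is treated as a special case because there is no preceding node to stop at.

```java
// Hypothetical sketch: indexed add and remove on a head-only singly-linked list.
class IndexedList<E> {
    private static class Node<T> {
        final T data;
        Node<T> next;

        Node(T data, Node<T> next) {
            this.data = data;
            this.next = next;
        }
    }

    private Node<E> head;
    private int count;

    int size() { return count; }

    // Returns the node at index i - 1, the one whose next reference must change.
    // Assumes 1 <= i <= count.
    private Node<E> nodeBefore(int i) {
        Node<E> current = head;
        for (int k = 0; k < i - 1; k++) {
            current = current.next;
        }
        return current;
    }

    void add(int i, E value) {
        if (i < 0 || i > count) {
            throw new IndexOutOfBoundsException("Index: " + i + ", Size: " + count);
        }
        if (i == 0) {
            head = new Node<>(value, head);
        } else {
            Node<E> before = nodeBefore(i);
            before.next = new Node<>(value, before.next);
        }
        count++;
    }

    E remove(int i) {
        if (i < 0 || i >= count) {
            throw new IndexOutOfBoundsException("Index: " + i + ", Size: " + count);
        }
        E data;
        if (i == 0) {
            data = head.data;
            head = head.next;
        } else {
            Node<E> before = nodeBefore(i);
            data = before.next.data;
            before.next = before.next.next; // skip over the removed node
        }
        count--;
        return data;
    }

    E get(int i) {
        if (i < 0 || i >= count) {
            throw new IndexOutOfBoundsException("Index: " + i + ", Size: " + count);
        }
        Node<E> current = head;
        for (int k = 0; k < i; k++) {
            current = current.next;
        }
        return current.data;
    }
}
```

Stopping one node early is what distinguishes this from the doubly-linked remove above, where the node at index i can reach its neighbor through prev.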