Iterators
Table of Contents
Adpated from Lenhart and Jannen: https://williams-cs.github.io/cs136-f20-www/slides/Iterators.pdf
1 Video
Here is a video lecture for the material outlined below. If you prefer (or want supplemental) textbook readings, Bailey Chapter 8 covers this topic.
The video contains sections on
- Motivation (0:01)
- Analyzing a quadratic time operation (4:38)
- Iterator interface (8:50)
- Fibonacci Iterator (10:21)
- Iterating over structures (16:02)
- Inner classes (19:08)
- SinglyLinkedListIterator (22:06)
- SkipIterator (26:13)
- ReverseIterator (28:25)
- Iterators and for-each loops (29:46)
The Panopto viewer has table of contents entries for these sections: https://carleton.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=7bb540f7-0278-4c5b-b811-acb60122bfd2
2 Motivation
- The problem: Efficient and uniform dispensing of values from data structures
- The solution: The
Iteratorinterface- Iterators as dispensers
- Iterators as generators
- Iterators as filters
- Iterators that iterate over other Iterators?!
- Yep, it's a thing
- Iterators and
forloops: TheIterableinterface- Allows use of iterators with for-each
Here's a method (count) that counts the number of times a particular Object appears in a List.
List is an interface provided by Java that specifies methods for a list-like structure.
Both ArrayList and LinkedList implement this interface.
public static int count(List<Object> data, Object elem) { int count = 0; for (int i = 0; i < data.size(); i++) { Object obj = data.get(i); if (obj.equals(elem)) // <-- Curly brackets can be omitted from an if, for, or while count++; // <-- when the body consists of a single line } return count; }
(All Objects in Java have a default equals method (or define their own).)
Does this count on all structures (that we have studied so far)?
2.1 Problems with count
get(i)not defined on stacks and queues (socountwill not work at all)get(i)is slow on some structures- Takes linear time on a singly- or doubly-linked list, since we have to loop through the nodes starting at the head (or tail)
- What does this mean for the performance of
counton linked lists?- Let's count up the steps:
get(0)visits 1 nodeget(1)visits 2 nodesget(2)visits 3 nodes- …
get(data.size() - 1)visits \(n\) nodes- \(1 + 2 + 3 + \cdots + n = (n + 1)\frac{n}{2} = \frac{n^2 + n}{2}\)
- Since it takes a number of steps proportional to \(n^2\), we say it take quadratic or polynomial time
- This is significantly worse than linear time as \(n\) gets really big
- Intuitively, we have a loop over each element (a linear time operation) that contains within it a linear time operation (
get)- linear \(\times\) linear = quadratic
- How do we traverse data in structures in a general, efficient way?
- Goal: data structure-specific for efficiency
- Goal: use same interface to make general
3 Iterator Interface
Every data structure we've studies has provided some form of these common operations
size()isEmpty()add()remove()
They also all provide a method for efficient data traversal: public Iterator<E> iterator().
Iterator<E> is an interface for classes that provide support for efficiently visiting all elements of a data structure.
The common methods it defines can serve to dispense values for
- Traversal of elements: Iteration
- Production of values: Generation
- Selection of values: Filtering
By providing a simple, minimal interface, the Iterator interface abstracts away details of how to access elements.
This must be part of the customized implementation for each data structure.
public interface Iterator<E> { boolean hasNext(); // are there more elements in iteration? E next(); // return next element }
4 Fibonacci Iterator
Here's a simple example of an iterator that generates values belonging to the sequence of numbers known as the Fibonacci Sequence. This sequence begins with two 1s, and then each subsequent number is the sum of the previous two numbers in the sequence. Thus, we have 1, 1, 2, 3, 5, …
Here's an iterator that effcienctly generates this sequence
import java.util.Iterator; public class FibonacciNumbers implements Iterator<Integer> { private int next; private int current; private int length; public FibonacciNumbers(int n) { length = n; next = 1; current = 1; } public boolean hasNext() { return length >= 0; } public Integer next() { length--; int temp = current; current = next; next = temp + current; return temp; } }
Why Is This Cool? (it is)
- We could calculate the \(i\)th Fibonacci number each time, but that would be slow
- Observation: to find the \(n\)th Fibonacci number, we need to calculate the previous \(n-1\) Fib numbers…
- But by storing some state, we can easily generate the next Fibonacci number in constant time
- Knowledge about the structure of the problem helps us traverse the Fibonacci space efficiently one element at a time
- We can do the same for data structures
5 Iterating Over Structures
Goal: Have a data structure produce an iterator that return the values of the structure in some order. How can we accomplish this?
- Define an iterator class for the structure, for example
class ArrayListIterator<E> implements Iterator<E> { ... } class SinglyLinkedListIterator<E> implements Iterator<E> { ... }
- Provide a method in the data structure that returns an iterator
public Iterator<E> iterator() { ... }
The details of hasNext() and next() often depend on the specific data structure.
For example, our SinglyLinkedListIterator will hold a reference to the next node whose value it will return.
In contrast our ArrayListIterator will hold the index of next element to return.
Given these two pieces, we rewrite count to use an iterator instead of repeated calls to get:
public int count (List<Object> data, Object elem) { int count = 0; Iterator<Object> iter = data.iterator(); while (iter.hasNext()) { if(elem.equals(iter.next())) count++; } return count; }
Why do we need an iterator method in the data structure that returns an iterator?
Why not just pass the data structure to the constructor for the iterator like this?
public SinglyLinkedListIterator<E>(SinglyLinkedList<E> list) { // code to construct the iterator }
In this case we would only have access to list's public methods.
We would be prevented from using the private instance variables of the data structure to efficiently implement our iterator.
From within the data structure, however, we can access the instance variables of the structure and use them in the implementation of the iterator.
6 Inner Classes
One way we can give iterators the privileged access to a data structure's internal data we need is to implement them as an inner class of the data structure.
In Java, we can nest one class definition within another.
We have used this already in our linked list implementations—our ListNode has been an inner class of the SinglyLinkedList class:
public class SinglyLinkedList<E> { private class ListNode { E data; ListNode next; } ... }
We could make ListNode an entirely separate class implemented in its own .java file, but there are some good reason to make it an inner class:
- It simplifies the
SinglyLinkedListimplementation somewhat if it can directly manipulate the fields of the nodes. IfListNodewas a separate class withprivatefields, we'd need to define getters and setters forSinglyLinkedListto use. - Logically, a
ListNodeisn't an object that should exist on its own. AListNodeshould only exist as part of aSinglyLinkedList. By making it aprivateinner class, we ensure it's something that can only be used internally bySinglyLinkedList. - In general, anytime it would be useful to have a new type of object as part of a data structure's implementation, we'll create an inner class for that purpose.
This same idea applies to iterators!
If we define our SinglyLinkedListIterator as an inner class within the SinglyLinkedList class, we can use the head field and directly access the next and data fields of the ListNode inner class:
import java.util.Iterator; public class SinglyLinkedList<E> { private class ListNode { E data; ListNode next; } private ListNode head; private int count; ... private class SinglyLinkedListIterator<E> implements Iterator<E> { private ListNode current; public SinglyLinkedListIterator() { current = head; } public boolean hasNext() { return current != null; } public E next() { E value = current.data; current = current.next; return value; } } public Iterator<E> iterator() { return new SingleLinkedListIterator(); } }
Because this iterator keeps track of the current node it's on, it can return the value at the next node in constant time.
This is a big improvement over repeated calls to the linear time get method (where we have to re-find the current node each time).
7 More Examples
We can also make iterators that filter the output of other iterators.
SkipIterator: Skips over a given valueReverseIterator: Dispenses elements in the reverse order given by another iterator
7.1 SkipIterator
- Problem: How can we filter out unwanted elements from an iterator
iter? - Solution: Create another iterator that takes
iteras a parameter to its constructor and uses that the methods ofiter(with some extra steps)- The
SkipIteratorwill ensure that the next element thatiterwould dispense is not the one we want to skip over! - Since
SkipIteratoronly needs access to thepublicmethods of another iterator (and not any class' internal data), we can implement it as its own separate class
- The
/** * An iterator that returns the values from another iterator, skipping over a specific value * Note: this implementation doesn't handle the edge case where a * skipped value is the last value in the original iterator */ import java.util.Iterator; public class SkipIterator<E> implements Iterator<E> { private Iterator<E> elems; private E skipValue; public SkipIterator(Iterator<E> iter, E skipMe) { elems = iter; skipValue = skipMe; } public boolean hasNext() { return elems.hasNext() } public E next() { E returnVal = elems.next(); while (returnVal.equals(skipValue) && elems.hasNext()) { returnVal = elems.next(); } return returnVal; } }
7.2 ReverseIterator
- Problem: How can dispense the elements from an iterator
iterin the opposite order from whichiterwould dispense them? - Solution: Create another iterator that
- Creates a
LinkedListcalledsecretList - Fills it with the elements dispensed by
iter - But stores them in reverse order
- Then asks
secretListfor an iterator to itself to use for dispensing values
- Creates a
/** * An iterator that reverses the order of elements returned from another iterator. */ import java.util.LinkedList; import java.util.Iterator; public class ReverseIterator<E> implements Iterator<E> { private Iterator<E> elems; public ReverseIterator(Iterator<E> iter) { LinkedList<E> list = new LinkedList<>(); while (iter.hasNext()) { // adding each element to front serves to reverse them---like a stack! list.addFirst(iter.next()); } elems = list.iterator(); } // All other methods dispatch to the underlying iterator. public boolean hasNext() { return elems.hasNext(); } public E next() { return elems.next(); } }
8 Iterators and for-each
Recall that with arrays, we saw we can use a simplified form of the for loop (called a for-each loop):
String[] arr = new String[] {"All", "the", "world's", "a", "stage"}; for( String s : arr) { System.out.println(s); }
We could also rewrite out count method from before to use a for-each loop:
public int count (List<Object> data, Object elem) { int count = 0; for (Object obj : data) { if(elem.equals(obj)) count++; } return count; }
Why does this work??
Because the List interface provides an iterator() method and we can use the for-each construct…
for( E elem : boxOfStuff ) { ... }
…as long as boxOfStuff implements the Iterable interface:
public interface Iterable<E> { public Iterator<E> iterator(); }
All of the Java structures we've studied so far (ArrayList, LinkedList, Stack, Queue, and Deque) implement Iterable, and so can all be used with a for-each loop.
9 Practice Problems1
- Suppose
iteris anIteratorover some data structure. Write a loop usingiterto print all the values of the data structure. - Suppose that
ais anArrayListofIntegervalues (ArrayList<Integer>). Write a loop that will use anIteratorto print thoseIntegervalues that are even. Try to see if you can write two ways: using awhileloop and using a for-each loop. - This code attemps to use an iterator to find and return the longest string in a linked list, but is has a bug. What is the problem and how would you fix it?
// returns the longest string in the list (does not work!) public static String longest(LinkedList<String> list) { Iterator<String> itr = list.iterator(); String longest = itr.next(); // initialize to first element while (itr.hasNext()) { if (itr.next().length() > longest.length()) { longest = itr.next(); } } return longest; }
- What is one benefit of making the list iterator into an inner class?
- Write a
CharacterIteratorclass that iterates over the characters in aString. It should implement theIterator<Character>interface, andnextshould return typeCharacter. The constructor should take theStringthat the instance will iterate over. It may be useful to refer to the Java String documentation.
10 Learning Block
- Lab 2
- Lots of great questions on the check-in forum, make sure to check it out!
- Adding dummy nodes does not change the efficiency of the list, but serves to greatly simplify some of the code
- By eliminating edge cases for many of the add and remove methods (see preparatory exercises for examples of these edge cases)
- Simpler code is easier to write correctly, easier to debug, easier to maintain/modify
- In what sense are the dummy nodes transparent?
- internally the list accounts for the dummy nodes, interacts with them, etc.
- the head and tail fields always refer to the dummy nodes, and they are always the first and last nodes in the list
- the effects of and values returned by public methods (i.e., the externally observable behavior of the list) are as if the dummy nodes don't exist
- internally the list accounts for the dummy nodes, interacts with them, etc.
- Method overloading
- Joe asked a great question on the forum: why are there three removes?
- private E remove(ListNode node)
- public E remove(int index)
- public E remove(E value)
- We want all three because each does a different thing
- But how does Java tell them apart?
- The parameter types are part of a method's identity—two methods with the same name, but different parameter types are treated as entirely separate and don't conflict
- When you call remove, you pass it a value. The type of this value tells Java which remove to use (the one that takes a value of that type)
- This is called method overloading.
- Joe asked a great question on the forum: why are there three removes?
- Remember to fill out the partner survey for lab 3!
- Questions?
- Form question
- In pods: implement WordIterator
- constructor takes a string
- next returns the next word in the string
- words are separated by spaces
- s.split(" ") returns a String[] of parts of s separated by spaces
- In pods: share lab progress
Footnotes:
Solutions:
while (iter.hasNext()) { System.out.println(iter.next()); }Iterator<Integer> iter = a.iterator(); while (iter.hasNext()) { int val = iter.next(); if (val % 2 == 0) { // check is the value is even System.out.println(val); } }
or
for (int val : a) { if (val % 2 == 0) { // check is the value is even System.out.println(val); } }
- The problem with the code is that its loop calls the
nextmethod on the iterator in two places: once when it tests the length of the string, and again when it tries to store the string as the longest. Each time you callnext, the iterator advances by one position, so if it’s called twice in the loop, you’ll skip an element when you find a match. For example, if the list contains the elements("oh", "hello", "how", "are", "you"), the program might see the "hello" and intend to store it, but the second call tonextwould actually cause it to store the element that follows "hello", namely, "how".The solution is to save the result of the itr.next() call into a variable. The following code would replace the
whileloop in the previous code:// this version of the code is correct while (itr.hasNext()) { String current = itr.next(); if (current.length() > longest.length()) { longest = current; } }
- When the iterator is an inner class, it can directly access the fields of the enclosing list object.
public class CharacterIterator implements Iterator<Character> { private String str; // String we are iterating over private int index; // index of the next char to return public CharacterIterator(String s) { str = s; index = 0; } public boolean hasNext() { return index < str.length(); } public Character next() { Character c = str.charAt(index); index++; return c; } }