Introduction to Trees

Table of Contents

Adapted in part from Lenhart and Jannen (https://williams-cs.github.io/cs136-f20-www/slides/treesIntro.pdf) and Reges (https://courses.cs.washington.edu/courses/cse143/20wi/notes/notes19.html)

1 Reading

Read Bailey sections 12.1, 12.2, and 12.3 (6 pages).

2 Learning Goals

After this lesson you will be able to

  1. Describe some of the many applications of trees
  2. Identify components of a tree structure using the formal terminology
  3. Define and construct a binary tree in Java
  4. Write a recursive binary tree method

3 What are trees?

3.1 Phylogenetic Tree

The genetic relationships between species can be represented in a "tree" structure.

species-tree.png

3.2 Organization Tree

Organziational hierarchy is a kind of tree.

org-tree.png

3.3 Family Tree

We often put family relationships into a tree.

family-tree.png

3.4 File System Tree

The files and folders on a computer can be thought of as a tree.

directory-tree.png

3.5 Game Tree

Like we've seen with decision trees, the choices when playing a game and the resulting game states naturally form a tree. A partial example for Tic-Tac-Toe is shown below (only three of the nine possible choices for X's first move are shown). Game-playing AI often involves searching over these game trees to find an advantageous move.

game-tree.png

3.6 Expression Tree

We can encode arithmetic expressions in a tree.

expression-tree.png

To read the above tree, we'd start at the bottom:

  • compute \(5+9\)
  • multiply the result by 2
  • add the result to 3

If we try and write it out just as it appears in the tree, we get something that's not quite right:

\(5 + 9 * 2 + 3\)

That's because the tree also defines an order of operations, which requires paranetheses when we write it out:

\(((5 + 9) * 2) + 3\)

3.7 The Larch

There are actual trees too, I suppose. They are nice, but not part of this course.

larch-tree.jpg

4 Tree Terminology

Like a linked list, a tree is made up of nodes that point to each other.

  • Typical tree terminology borrows the language of the family tree

    expression-tree.png

    • "+" is the parent node of 5 and 9
    • 5 and 9 are child nodes of "+"
    • 5 and 9 are sibling nodes
    • "*" is an ancestor of 5
    • 5, 9, "+", and 2 are all decendents of "*"
  • The root node is the node at the top of the tree with no parent
    • "+" in the expression tree example
  • Leaf nodes are nodes at the bottom with no children
    • 3, 2, 5, and 9 are all leaf nodes
  • Interior nodes are those in the middle, neither root nor leaves
    • "*" and the lower "+" are examples of interior nodes

There are also properties of nodes and trees we often care about:

  • The degree of a node is the number of children that node has
    • Our expression nodes either have degree 0 (the numeric leaf nodes) or 2 (the interior/root operator nodes)
  • The depth of node is the number of edges from the root to that node

    tree-depth.png

  • The height of a node is the number of edges on the longest path from the node to a leaf. A leaf node will have a height of 0.

    tree-height.png

  • The degree of a tree is the maximum degree across all nodes in that tree
  • The height of tree is the maximum depth across all nodes in that tree (equivalently, the height at the root)

5 Defining a Binary Tree

An expression tree is an example of a category of trees call binary trees. These are trees with degree at most 2, meaning each node has 0, 1, or 2 children. Like with linked lists, we will define a binary tree recursively. It is either:

  • an empty tree (with no nodes)
  • or a root node with left and right subtrees
       +-----------+
       | root node |
       +-----------+
           /   \
         /       \
       /           \
     /\             /\
    /  \           /  \
   /    \         /    \
  / left \       / right\
 / subtree\     / subtree\
+----------+   +----------+

This definition lends itself to all kinds of different trees. The simplest kind of tree is an empty tree, which can't really be drawn because it's empty. Once you have an empty tree, you can use the second part of the definition to make a new kind of tree that is composed of a root node with left and right subtrees that are both empty. In other words, that would be a single leaf node, which we could represent with a star:

*

Now that we have this as an option, we can use our recursive rule to say that a tree could be a root node with a left tree that is empty and a right tree that is a leaf:

*
 \
  *

Or we can have an empty right and a leaf to the left:

  *
 /
*

Or we could have leaf on either side:

  *
 / \
*   *

These now become possibilities to use for our recursive definition, allowing us to construct even more kinds of trees, as in:

     *                  *            *            *
    / \                / \          / \          / \
  *     *            *     *      *     *      *     *
 / \   / \          / \            \   /        \     \    
*   * *   *        *   *            * *          *     * 

Each of the *s represents a BinaryTreeNode, which we can draw as a box in the same way we drew linked list nodes:

+-----------+
|    data   |
|-----------|
| left|right|
+--+-----+--+
  /       \
 /         \

In Java, we'd define the class like this:

class BinaryTreeNode<E> {
    public E data;
    public BinaryTreeNode<E> left;
    public BinaryTreeNode<E> right;

    public BinaryTreeNode(E data) {
        this(data, null, null); // call the other, 3-argument constructor
    }

    public BinaryTreeNode(E data, BinaryTreeNode<E> left, BinaryTreeNode<E> right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }
}

To construct our expression tree for \((5 + 9) * 2 + 3\), we could do:

BinaryTreeNode<String> five = new BinaryTreeNode<>("5");
BinaryTreeNode<String> nine = new BinaryTreeNode<>("9");
BinaryTreeNode<String> fivePlusNine = new BinaryTreeNode<>("+", five, nine);
BinaryTreeNode<String> fivePlusNineTimesTwo = new BinaryTreeNode<>("*", fivePlusNine, new BinaryTreeNode<String>("2"));
BinaryTreeNode<String> fivePlusNineTimesTwoPlusThree = new BinaryTreeNode<>("+", new BinaryTreeNode<String>("3"), fivePlusNineTimesTwo);

6 Binary Tree Methods

We could also write a method to compute the result (or evaluate) one of the expression trees:

int evaluate(BinaryTreeNode<String> expr) {
    if (height(expr) == 0) // <-- base case
        // when a node's height is 0, it has no children, and is therefore a leaf
        // in expression trees, leaves are the numbers
        return Integer.parseInt(expr.data);
    else { // <-- recursive case

        // recursively evalutate the left and right subtrees
        int left = evaluate(expr.left);
        int right = evaluate(expr.right);

        // perform the operation at this node
        String op = expr.data;
        switch (op) {
            case "+":
                return left + right;
            case "-":
                return left - right;
            case "*":
                return left * right;
            case "/":
                return left / right;
        }
        // op must be one of +, -, *, or /
        throw new UnsupportedOperationException(op + " is not a recognized operator");
    }
}

The recursive nature of binary trees makes recursive methods like these a natural fit. Need something related to the left or right child? Remember that these children are actually the roots of their own subtrees. Thus, whatever operation we're performing on the current node, we can recursively perform on the child nodes.

But what about that height method I call in the evaluate method above? Turns out that is nicely recursive as well.

int height(BinaryTreeNode<E> node) {
    // base case: an empty tree has height 0
    if (node == null) {
        return 0;
    // another base case: a leaf node (one with no children) has height 0
    } else if (node.left == null && node.right == null) {
        return 0;
    } else {
        // otherwise, we know that the height of this node is one more 
        // than the height of its children
        // since the children are also BinaryTreeNodes, we can use
        // this method to recursively get their heights
        // and then just add one to whichever is greater
        return 1 + Math.max(height(node.left), height(node.right));
    }
}

7 Practice Problems1

  1. Can a tree have no root? Can a tree have no leaves?
  2. Can a binary tree have more leaves than non-leaf nodes? Can it have more interior nodes than leaves?
  3. In a binary tree, which node (or nodes) have greatest height?
  4. Why are arithmetic expressions naturally stored in binary trees?
  5. Could a linked list be considered a binary tree?
  6. Recursion is used to compute many properties of trees. What portion of the tree is usually associated with the base case?
  7. Draw an expression tree for each of the following expressions.

    1. \(1\)
    2. \(1 + 5 * 3 - 4/2\)
    3. \(1 + 5 * (3 - 4)/2\)
    4. \((1 + 5) * (3 - 4/2)\)
    5. \((1 + (5 * (3 - (4/2))))\)

    Circle the nodes that are ancestors of the node containing the value 1.

  8. Given the implementation of BinaryTreeNode above, could we write a int depth(BinaryTreeNode<E> node) method that returns the depth of node? If not, what might we add to BinaryTreeNode to enable this?

Footnotes:

1

Solutions:

  1. Yes, in both cases, but only if the tree is empty.
  2. Yes, but barely. If every non-leaf node has 2 children there will be one more leaf than non-leaves. A spindly tree—a tree with no nodes of degree 2—can have as many interior nodes as it wants, but it always has exactly one leaf.
  3. The root has the greatest height, if there is a root.
  4. Because the operations in arithmetic expressions usually involve at most two operands. Each operand is represented by an independent subtree of an operator node.
  5. Yes, a linked list can be viewed as a binary tree where the head is the root, the tail is a leaf, and the other nodes are interior tree nodes.
  6. Usually the base case is found near the leaves. The recursion starts at the root, and follows the branches outward. It stops when it gets to the leaves.
    1. 1
      
    2.      (-)
       (+)      /
      1   *    4 2
         5 3
      
    3.    (+)
      1       /
            *   2
          5   -
             3 4
      
    4.      (*)
       (+)      -
      1   5   3   /
                 4 2
      
    5.   (+)
      1     * 
          5   -
            3   /
               4 2
      
  7. We would not be able to write a depth method because we have no way, given a node, to navigate from that node to the root of the tree. We would need to modify the BinaryTreeNode class to contain a reference to its parent node in order to write such a method.