The main topics of this worksheet are:
Project 3 will focus on tree-like data structures, so it is very important to get practice with this material!
These things might be helpful while working on the problems. Remember that for worksheets, we don't strictly limit what resources you can consult, so these are only suggestions.
The most useful files from the sample code repository are:
I've made one addition to trees.py
since Lecture 18 that's relevant to this worksheet.
Previously, a Node
object only knew about its children, and had no way to find its parent node (if any). This meant that it was possible to descend into a tree, but was not possible to start at an internal node and head toward the root.
To address this, I've added an attribute parent
to the class Node
, and made sure that this attribute is set whenever child nodes are created. So while before a child could be created with just
# Make a left child (which has no idea we're its parent)
node = Node()
self.left = node
now it is necessary to do this:
# Make a left child (that knows we are its parent)
node = Node(parent=self) # New node that thinks we are its parent
self.left = node # Make that node our child
or to create the Node
without a parent but then set it directly, i.e.:
# Make a left child (that knows we are its parent)
node = Node()
self.left = node # set left child to node
node.parent = self # tell node we are its parent
Suppose you have two Node
objects (or objects of a subclass of Node
) as defined in (https://github.com/emilydumas/mcs275spring2023/blob/main/samplecode/datastructures/trees.py). Let's call them x
and y
. You might ask how they are related to one another, if it all. In general, exactly one of the following statements will be true:
Write a function node_relationship(x,y)
that takes two nodes and returns one of these strings:
"equal"
"ancestor"
"cousin"
"unrelated"
according to which of the cases listed above describes the relationship between x
and y
.
(Note that the use of "cousin" to describe case (3) doesn't fit the genealogical analogy perfectly, as it would also include the case of siblings.)
Also, make test cases for your function that generate each of the possible answers.
This solution makes use of a helper function ancestors
, which lists all ancestors of a node x
, starting from the node itself and ending with the root of the tree.
def ancestors(x)
"""
Given node, returns list containing its parent, its parent's parent, etc.
"""
L = [x]
while L[-1].parent != None: # While we can find a parent of the most recently-found parent
L.append(L[-1].parent) # Put parent into the list
return L
def node_relationship(x,y):
"""Returns string representing relation between nodes `x` and `y`."""
if x == y:
return "equal"
elif x in ancestors(y):
return "ancestor"
elif y in ancestors(x):
return "ancestor"
elif ancestors(y)[-1] == ancestors(x)[-1]: # If both share a root
return "cousins"
else:
return "unrelated"
trees.py
¶The rest of the worksheet focuses on adding features to the module trees.py
from lecture. To prepare for that, download these files and save them in a directory where you'll work on worksheet 8.
It will be hard to test your code if you don't have any search trees to test it with. To that end, I created a module treeutil.py
(which you were asked to download above) that will generate random binary search trees on request. It can also give a random node of the tree, or a random pair of nodes.
Open the full documentation of the module in a separate browser tab, and keep it open as you work:
The main things you'll be doing in such tests are:
treeutil
functions to make trees to test your code ontreevis.treeprint
to display trees for manual inspectionAdd two new methods, minimum
and maximum
to the BST
class for finding the node with the smallest or largest key in the subtree that has this node as its root.
# These should go inside class `BST` in `trees.py`
def minimum(self):
"""
In the subtree that this node is the root of, find and
return the the smallest key.
"""
if self.left != None:
return self.left.minimum()
else:
return self.key
def maximum(self):
"""
In the subtree that this node is the root of, find and
return the the largest key.
"""
if self.right != None:
return self.right.maximum()
else:
return self.key
Add a method depth
to the BST
class that computes the depth of the tree.
# This should go inside class `BST` in `trees.py`
def depth(self):
"""
Return the length (in number of edges) of the longest descending
path starting from this node.
"""
# Base case: if there are no child Nodes, cannot go any further down
if self.left == None and self.right == None:
return 0
# Cases where only one child Node is defined
if self.right == None:
return self.left.depth() + 1 # Add 1 to account for increased depth
elif self.left == None:
return self.right.depth() + 1 # Add 1 to account for increased depth
# Last case: both left and right child Nodes are defined
else:
# Note this is the built-in `max` function, not the one we wrote in Q2
return max(self.left.depth(), self.right.depth()) + 1
Suppose you are given a binary search tree T
and a node x
that has no children. If you change the key of x
, the tree may or may not be a binary search tree.
Here's an example to illustrate this. Consider this BST:
[15]
[9] [29]
[5] [12]
If x
is the node with key 12, then changing the key to 13 still gives a binary search tree:
[15]
[9] [29]
[5] [13]
But if we change the key to 6, the result is not a binary search tree; the right subtree of the node with key 9 contains keys smaller than 9:
❌THIS IS NOT A BST❌
[15]
[9] [29]
[5] [6]
And if we change the key to 18, the result is not a binary search tree. The left subtree of the root node contains a node with key larger than 15:
❌THIS IS NOT A BST❌
[15]
[9] [29]
[5] [18]
Now, if you look more closely at this example, you can convince yourself that the key 12 can be changed to any number in the closed interval $[9,15]$ while keeping the BST condition, but that anything outside of this range will result in a non-BST. This is called the interval of the node.
Add a method to BST
that can be called whenever the node has no children, and which returns a tuple (kmin,kmax)
where kmin
and kmax
are the smallest and largest values (respectively) that the key of that node could be changed to while keeping the BST condition. In some cases, there may be no lower or upper limit, in which case the value None
should be returned for kmin
, kmax
, or both. If called on a node that has children, it should raise an exception.
Use this method definition:
# This should go inside class `BST` in `trees.py`
def interval(self):
"""
If this node has no children, return the minimum and maximum key values that
could be given to this node without violating the BST condition. The value None
is used to indicate that the range of allowable values has no upper or lower
bound.
"""
For example, in this tree:
[15]
[9] [29]
[5] [12]
calling .interval()
on its nodes will result in the following return values:
[15] : (raises exception because the node has children)
[9] : (raises exception because the node has children)
[5] : None,9
[12] : 9,15
[29] : 15,None
Finally, it may be helpful in this problem to recall that functions in Python can return multiple values, and if they do, the returned values are assembled into a tuple. That means the two statements below do the same thing:
return 1,2,3
return (1,2,3)
def interval(self):
"""
If this node has no children, return the minimum and maximum key values that
could be given to this node without violating the BST condition. The value None
is used to indicate that the range of allowable values has no upper or lower
bound.
"""
if self.left != None or self.right != None:
raise ValueError("Method `interval` may only be used on nodes with no children")
# Left and right sides of interval to be returned
interval_left = None
interval_right = None
if self.parent == None: # If we are at the root of the tree, then interval is unlimited
return (None, None)
# If current node is to the parent's left
if self.parent.left == self:
interval_right = self.parent.key # Upper bound is given by parent's key
if self.parent.parent != None: # If grandparent exists, it might be the lower bound
grandparent = self.parent.parent
# Check if parent is to the right of its own parent
if grandparent.right.key == self.parent.key:
# If so, then parent's parent's key is lower bound
interval_left = grandparent.key
# Else, current node is to the parent's right
else:
interval_left = self.parent.key # Lower bound is given by parent's key
if self.parent.parent != None: # If grandparent exists, it might be the upper bound
grandparent = self.parent.parent
# Check if parent is to the left of its own parent
if grandparent.left.key == self.parent.key:
# If so, then parent's parent's key is upper bound
interval_right = grandparent.key
return (interval_left, interval_right)
This is an extra activity to work on if you finish all the exercises above.
Given a node x
in a BST, you might want to know its successor, meaning the node whose key is larger than x
but as small as possible given that condition. (We'll assume in this problem that all nodes in the tree have distinct keys).
It turns out that there is an efficient description of the successor as follows:
x
has a right child, then the successor is the smallest key in the right subtree of x
x
does not have a right child, but has a successor, then the successor is an ancestor of x
. If you visit these ancestors by traveling upward from x
, then the first node you reach that is the left child of its parent is the successor.x
has no right child, and if moving up from x
to the root only ever visits nodes that are right children of their parents, then x
is the maximum of the tree, so it has no successor.Write a method successor
of class BST
that finds and returns the successor of a node, or returns None
if there is no successor.
successor
is useful¶In a BST it is easy to delete a node that has zero or one children. In case it has one child, you can just "promote" the child to take the place of the deleted parent.
If you want to delete a node with two children from a BST, you can do so by finding the successor of the node, removing it, and then letting the successor take the place of the node with two children. Since the successor cannot have two children, removing the successor falls under the "easy" case described above.