Unit 6 - Trees
Table of Contents
- Overview: Binary Trees, Heaps
- The Tree as an Abstract Data Type
- Implementing a Binary Tree w/ Python Lists
- Implementing a Binary Tree w/ Nodes
- Binary Tree Applications
- Traversing a Binary Tree
- Intro to Heaps
- Binary Search Tree
6.1. Overview - Trees
Up to this point we've examined a few common ways of storing data and manipulating that data. One common data storage structure is simply the variable, which references some single piece of data, a value or an object of some type.
A list is another type of data structure, built into Python, although we've constructed our own lists based on Node
objects. We've looked at dictionaries for storing information, and built Stack
s, Queue
s, and Deque
s.
Another type of data structure—an Abstract Data Type (ADT)—is enormously useful for more complex problems: the tree.
We're going to define the ADT for the tree, and then implement it using a couple of different strategies: using Python lists, and then using a network of Node
s.
Then we'll take a look at some of the ways we can use trees to solve problems.
Let's get started!
6.2. The Tree as Abstract Data Structure
A tree is a hierarchical data structure that consists of a single root node with some number of edges connecting it to other nodes lower down in the hierarchy.
A common example of a tree is the file system on a computer, which begins with a root directory that contains other files and subdirectories. Here's a partial diagram of some of the directories and files on my computer:
And here's what that structure looks like, written as a tree:
There are lots of other examples of tree structures: the Linnaean classification system for organisms (Kingdom - Phylum - Class - Order - Family - Genus - Species), HTML tags in a web document, etc.
Whenever you classify information into a tree structure, there's some vocabulary that you should be familiar with.
Tree vocabulary
- Node - usually consists of a name (key) and some data (a value), along with pointers to other nodes
- Edge - the line that connects two nodes. A node can have several outgoing edges, but only one incoming edge.
- Root - the single node at the top of the tree, with no incoming edges
- Path - an ordered list of nodes, all connected by edges. In the example above, the path to document1 is:
root → Users → rwhite → Documents → document1
- Siblings, Children - the nodes with edges incoming from the same parent node (one level above)
- Parent - a node that has outgoing edges to children one level below in the tree
- Leaf node - a node that has no children
- Level - for any node, calculated as the number of edges from the root node to this node. For example, document1 above has 4 edges from the root, and so is at level 4. The root node is at level 0.
- Height of a tree - the maximum number of levels in a tree
- Subtree - a given node and all of its children nodes
With this vocabulary in mind, we are in a position to define a tree.
Definition: Tree
A tree is a set of nodes along with edges connecting pairs of nodes such that:
- There is a single root at level 0 with no parent.
- Every node has a single incoming edge connecting it to its parent.
- There is a unique path from the root to each node.
Additionally, if there are no more than two children for each parent, the tree is considered a "binary tree."
We can also define a tree recursively: a tree consists of a root and zero or more subtrees, each of which is also a tree. In this definition, the root of each subtree is connected via an edge to the root of its parent.
For much of our discussion below, we'll be considering this collection of people's names, arranged in a binary tree. (The arrangement doesn't reflect any familial or social connection. The names are just there as convenient references to the nodes.)
6.2.1. Interacting with a Binary Tree
At first we're going to focus on binary trees.
As we construct and interact with a binary tree, we're going to need a set of methods.
Binary Tree methods
-
BinaryTree(val)
constructs a binary tree with the given value and no children and returns that tree -
get_root_val()
returns the root value of this particular sub-tree -
set_root_val(val)
modifies the root value of this particular sub-tree -
get_left_child()
returns the subtree that is the left child of the current root -
get_right_child()
returns the subtree that is the right child of the current root -
insert_left(val)
creates a new binary tree and installs it as the left child of the current node, moving the current child down one level in the tree -
insert_right(val)
creates a new binary tree and installs it as the right child of the current node, moving the current child down one level in the tree
6.2.2. Using methods to construct a Binary Tree
Given the API described above, what commands would we issue to construct the "Aaron Tree" above? (There are multiple strategies for doing this.)
Constructing a BinaryTree using the ADT
Create a series of binary tree commands that will construct the Aaron Tree.
Show/hide the BinaryTree
ADT commands
# There are multiple strategies for accomplishing this. # This uses a few different ideas: bt = BinaryTree("Aaron") bt.insert_left("Dylan") bt.insert_left("Bonnie") bt.get_left_child().insert_right("Elise") bt.insert_right("Charlie") right_child = bt.get_right_child() right_child.insert_left("Freddy")
6.3. Implementing a Binary Tree with Python Lists
Now that we've identified the Abstract Data Type for a binary tree we can consider some of the ways that it might be implemented in Python.
6.3.1. The Binary Tree as a list of lists
One simple way to represent a Binary Tree is as a list, where the root is the 0th element in the list, the left child is the 1st element in the list, and the right child is the 2nd element in the list. Left or right subtrees can be represented as additional lists:
r = ["Aaron", [ ], [ ]] / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ ["Bonnie", [ ], [ ]] ["Charlie", [ ], [ ]] / \ / / \ / / \ / / \ / ["Dylan", [ ], [ ]] ["Elise", [ ], [ ]] ["Freddy", [ ], [ ]] ... or ... r = ["Aaron",["Bonnie", ["Dylan",[],[]],["Elise",[],[]]], ["Charlie",["Freddy",[],[]],[]]] [ value ,[ ---------------left child-------------- ], [ ---------right child---------- ] ]
Problem: List-based Binary Tree
Write a list-based representation of a Binary Tree using the following functions:
binary_tree(val)
get_root_val(root)
set_root_val(root, new_val)
get_left_child(root)
get_right_child(root)
insert_left(root, new_branch)
insert_right(root, new_branch)
Then give your Binary Tree these commands to test your Binary Tree functions.
def main(): t = binary_tree(3) print("Instruction: t = binary_tree(3)") print("Result:", t) print("Expect: [3, [], []]") insert_left(t, 4) print("Instruction: insert_left(t, 4)") print("Result:", t) print("Expect: [3, [4, [], []], []]") insert_left(t, 5) print("Instruction: insert_left(t, 5)") print("Result:", t) print("Expect: [3, [5, [4, [], []], []], []]") insert_right(t, 6) print("Instruction: insert_right(t, 6)") print("Result:", t) print("Expect: [3, [5, [4, [], []], []], [6, [], []]]") insert_right(t, 7) print("Instruction: insert_right(t, 7)") print("Result:", t) print("Expect: [3, [5, [4, [], []], []], [7, [], [6, [], []]]]") l = get_left_child(t) print("Instruction: l = get_left_child(t)") print("Result: l =", l) print("Expect: l = [5, [4, [], []], []]") set_root_val(l, 9) print("Instruction: set_root_val(l, 9)") print("Result: l =", l) print("Expect: l = [9, [4, [], []], []]") insert_left(l, 11) print("Instruction: insert_left(l, 11)") print("Result:", t) print("Expect: [3, [9, [11, [4, [], []], []], []], [7, [], [6, [], []]]]") print("Instruction: print(get_right_child(get_right_child(t)))") print("Result:", get_right_child(get_right_child(t))) print("Expect: [6, [], []]") if __name__ == "__main__": main()
Show/hide the BinaryTree
as a list
#!/usr/bin/env python3 """ binary_tree_demo.py A simple, list-based version of a binary tree. In this example, each node consists of a list containing a value, and references to the two children: Node[ value, left_child, right_child ] """ def binary_tree(r): """Constructs a binary tree with value r and no children. """ return [r, [], []] def get_root_val(root): return root[0] def set_root_val(root, new_val): root[0] = new_val def get_left_child(root): return root[1] def get_right_child(root): return root[2] def insert_left(root, value): """Insert a new subtree of "value" as a left child. """ new_subtree = binary_tree(value) # create new node new_subtree[1] = root[1] # set current left_child to # left_child of new node root[1] = new_subtree # set root left_child reference # to this new subtree def insert_right(root, value): """Insert a new subtree of "value" as a right child. """ new_subtree = binary_tree(value) # create new node new_subtree[2] = root[2] # set current right_child to # left_child of new node root[2] = new_subtree # set root right_child reference # to this new subtree def main(): t = binary_tree(3) print("Instruction: t = binary_tree(3)") print("Result:", t) print("Expect: [3, [], []]") insert_left(t, 4) print("Instruction: insert_left(t, 4)") print("Result:", t) print("Expect: [3, [4, [], []], []]") insert_left(t, 5) print("Instruction: insert_left(t, 5)") print("Result:", t) print("Expect: [3, [5, [4, [], []], []], []]") insert_right(t, 6) print("Instruction: insert_right(t, 6)") print("Result:", t) print("Expect: [3, [5, [4, [], []], []], [6, [], []]]") insert_right(t, 7) print("Instruction: insert_right(t, 7)") print("Result:", t) print("Expect: [3, [5, [4, [], []], []], [7, [], [6, [], []]]]") l = get_left_child(t) print("Instruction: l = get_left_child(t)") print("Result: l =", l) print("Expect: l = [5, [4, [], []], []]") set_root_val(l, 9) print("Instruction: set_root_val(l, 9)") print("Result: l =", l) print("Expect: l = [9, [4, [], []], []]") insert_left(l, 11) print("Instruction: insert_left(l, 11)") print("Result:", t) print("Expect: [3, [9, [11, [4, [], []], []], []], [7, [], [6, [], []]]]") print("Instruction: print(get_right_child(get_right_child(t)))") print("Result:", get_right_child(get_right_child(t))) print("Expect: [6, [], []]") if __name__ == "__main__": main()
6.4. Implementing a Binary Tree w/ Nodes
6.4.1. Review - Nodes and Unordered Lists
You recall that earlier in the course we used the idea of a Node
which consisted of two components: a piece of data and a pointer.
class Node: """A Node object is used to create a linked list, and is characterized by a value (data), and a pointer (next) that may be set to point to the next Node object in the linked list. """ def __init__(self, data): """Creates a node with a given value, and a next pointer initialized to None. """ self.data = data self.next = None def get_data(self): return self.data def get_next(self): return self.next def set_next(self, new_next): self.next = newNext def set_data(self, new_data): self.data = new_data
We could use a series of Node
objects, one pointing to the next, to keep track of a linked list that would either be Unordered or Ordered.
Show/hide the UnorderedList
class
class UnorderedList(): """An Unordered List object refers to the first Node in a linked list, call the "head." If the list has not been created yet, the value of head is None. """ def __init__(self): self.head = None def is_empty(self): return self.head == None def add(self, item): """creates a new Node and adds it to the list""" new_node = Node(item) new_node.set_next(self.head) # newNode points to old head self.head = new_node def length(self): """Traverses through the list to identify the total number of nodes in the list. This traversal strategy is used by a number of the methods in this class. """ node_count = 0 next_node = self.head while next_node != None: node_count += 1 next_node = next_node.get_next() return node_count def search(self, item): """Returns True if the item exists in the list.""" next_node = self.head while next_node != None: if next_node.get_data() == item: return True next_node = next_node.get_next() return False def remove(self,item): """This tricky method uses a pair of variables to monitor the traversal so that we can remove the desired item from the list by taking the pointer from the previous Node and setting it to the item AFTER the one that we're removing. In this version of the UnorderedList, remove works on multiple values. """ previous = None current = self.head while current != None: if current.get_data() == item: if previous == None: self.head = current.get_next() else: previous.set_next(current.get_next()) current = current.get_next() else: previous = current current = current.get_next()
6.4.2. The BinaryTree as a collection of BinaryTree
objects
The BinaryTree ADT can, of course, be implemented in different ways. Above we used nested lists to accomplish that. Now let's try using a variation on our previous Node
class.
Our previous Node implementation only had a single pointer. We can transform that node structure to a binary tree structure simply by adding a second pointer.
Problem: Binary Tree class
Write a Python class for a Binary Tree with the following methods:
.__init__(root_value)
.get_root_val()
.set_root_val(new_val)
.get_left_child()
.get_right_child()
.insert_left(value)
.insert_right(value)
.__str__()
Once you have your BinaryTree()
class written, make sure you can use the methods of that class to create any given binary tree.
6.5. Binary Tree Applications
One use of binary trees is for parsing information: analyzing a set of data and breaking it down into its components. One can parse a sentence into its words and their associated function—nouns, verbs, adjectives, conjunctions, prepositions, subjunctive clauses, etc.—and one can parse a mathematical expression into its component operations.
Let's see how to use a binary tree to build a parse tree for mathematical expressions.
6.5.1. Fully Parenthesized Expressions
The classic mathematical order of operations ("Please Excuse My Dear Aunt Sally?") describes the order in which operations are performed when evaluating a mathematical expression:
- Parentheses
- Exponents (
**
) - Multiplication and Division
- Addition and Subtraction
So the expression 2 + 3 × 8
would evaluate to 26
, not 40
. The multiplication is performed first according the order of operations, simplifying the expression to 2 + 24
. The 2
is then added to yield 26
.
It can be a complex process, encoding the order of operations into an algorithm, so we're going to focus on evaluating fully parenthesized expressions, in which a pair of parentheses encloses every operation.
To "parenthesize" 2 + 3 × 8
, we would write:
( 2 + ( 3 × 8 ) )
or
( ( 2 + 3 ) × 8 )
depending on our desired calculation.
The inner parentheses clearly indicate that which operation is to be evaluated first.
A fully parenthesized expression
How would you fully-parenthesize the calculation of a quadratic formula discriminant = b2 - 4ac
?
Show/hide parenthesized discriminant
( ( b ** 2 ) - ( ( 4 * a ) * c ) )
6.5.2. Tokens
When we get a fully-parenthesized expression we'll need to split it up into a series of tokens, the components that together comprise the expression.
Our tokens for a simple mathematical expressions will consist of the following four types:
- the open parenthesis "(", indicating the start of a new expression
- an operand, one of the numbers that will be used in evaluating the expression
- an operator, +, -, ×, or / (in our simple example here)
- the closed parenthesis ")", indicating the completion of an expression
6.5.3. What do we do with the tokens?
Given a fully-parenthesized expression as input, then, we'll convert it into a binary tree, and then evaluate the tree by working our way through it. First, let's create the binary tree.
Parsing a parenthesized expression into a binary tree
Create an empty binary tree with a single root node that has no value. Then go through the tokens from left to right.
- If a "(" is encountered, add a new node (binary tree) as a left child and descend to it.
- If an operand is found, set the current root to that value and ascend to the parent.
- If an operator is found, set current root to that operator, add a new node as a right child and descend to it.
- If a ")" is encountered, ascend to the parent.
Following these rules with a fully parenthesized expression ( 2 + ( 3 × 8 ) )
and make sure you see how this binary parse tree would be arrived at:
6.5.4. Building a Parse Tree
One of the challenges in working our way through a parse tree like this is the idea of maintaining our sense of location. When we say "descend" from a parent to a child, we're referring to leaving the current tree and changing our focus onto one of the children, one of the subtrees. And when we ascend from a subtree to a parent, we need to know which node we're referring to.
In our parse_tree
program we'll use a Stack
to keep track of which level we're on. When we descend to a subtree, we'll push the parent onto the stack so that we won't lose track of it.
Likewise, when we ascend, we'll pop from the stack and get the parent tree as a result, and we can continue with our evaluation.
(Another strategy for managing our way through a tree could be to modify BinaryTree so that we maintain a reference to the parent of any given node. We'll examine that strategy when we look at heaps soon. For now, we'll use the Stack with our Binary Tree to see how that works!)
6.5.5. Evaluating a Parse Tree
Once the tree is built, we're only halfway through our task: we also need to use the tree to complete our analysis. There are several ways to do this, depending on the problem. Here, our recursive strategy will consist of:
- If we've reached a node on the tree that has no children (a leaf node), return the value of that node (it's an operand). [Base case]
- Otherwise, check the root value of this node (it must be an operator), and return a result based on that operator and recursive calls to the left and right subtrees for evaluation.
The next project will give us some experience with this process.
Building a Parse Tree Builder and Evaluator
Write a program parse_tree.py
that uses the BinaryTree
and Stack
classes to create a binary tree from the fully-parenthesized expression. We'll do this by pushing the current tree (current_focus
) onto a stack when we descend to its subtree to edit that subtree, then pop the stack to get back to previous parent trees when we've completed filling out the subtree.
Your program should include two functions:
build_parse_tree(expr)
which takes a fully-parenthesized expression as a string parameter and builds a binary parse tree from it, andevaluate(parse_tree)
which goes through and recursively solves the problem as represented in the tree.
6.6. Traversing a Binary Tree
In the previous parse tree project we visited all the nodes of our binary tree so that we could evaluate the expression that it represented. Our strategy was to recursively evaluate the left side of a tree, identify the operation we would be working on, and then recursively evaluate the right side of the tree. This "left child, root, right child" ordering is called a inorder traversal, because the root is evaluated in the middle of the order, after the left right child has been evaluated, but before the right child has been evaluated.
There are other strategies for traversing a binary tree as well.
- preorder = for every subtree identify the root node first, then the left child, and then the right child
- inorder = for every subtree identify the left child first, then the root, and then the right child
- postorder = for every subtree identify the left child first, then the right child, and finally the root
The pre-, in-, and post- prefixes refer to where in the traversing we identify the root of the subtree.
Traversing a Binary Tree
Write a program binary_traversal_demo.py
that:
- Creates a binary tree and fills it with values
- Runs three functions that you have written to demonstrate traversing the tree:
preorder()
,inorder()
, andpostorder()
.
Calling each of the functions causes them to recursively work through the tree, eventually returning a list of the values that result from traversing the tree in the indicated fashion.
As an example consider this binary tree:
a / \ / \ / \ / \ / \ b c / \ / \ / \ / \ d e f g / \ \ / h i j k
Calling the preorder()
function on this tree would return the result [a, b, d, h, i, e, j, c, f, g, k]
.
What do the inorder
and postorder
traversals looks like?
Show/hide the other traversals
Inorder:
['h', 'd', 'i', 'b', 'e', 'j', 'a', 'f', 'c', 'k', 'g']
Postorder:
['h', 'i', 'd', 'j', 'e', 'b', 'f', 'k', 'g', 'c', 'a']
6.7. Binary Tree as a minHeap
One type of tree is a heap which can be used for lots of different applications.
Definition: Heap
A heap is a tree-based data structure in which each parent node has a value that is less than its child nodes (in the case of the tree being a "min heap"), or in which each parent node has a value that is greater than its child nodes: a "max heap."
The example to the right is a min heap. Note that a nodes on one side may have values that are out of order in comparison with nodes on the other side. This is acceptable, as long as the ordered parent-child relationship between two nodes is maintained.
Another property of the heap is that the tree is balanced—it has approximately equal numbers of nodes on either side of the tree. It maintains this property by being complete: every level of the tree is completely filled, with the possible exception of the bottom level. This bottom level is maintained so that it is filled from left to right.
One application for a heap is a priority queue. A standard queue takes a "First In, First Out" (FIFO) approach to managing data: values or processes are dealt with in the same order that they arrived at the queue.
In a priority queue, the queue is ordered according to some priority system. An example of a priority queue is patients waiting to see a doctor in the emergency room. Although patients are generally seen in the order in which they arrive, someone with a life-threatening condition will have a higher priority, and be moved to the head of the queue to be seen earlier rather than later.
Clearly a priority queue requires ordering the items in the queue in some fashion. And although we could try to keep our queue ordered with some type of sorting algorithm, let's see how we can manage that queue with a heap.
For our example here, we'll be looking at a MinHeap. The nodes will be manipulated such the minimum values receive priority, so lower values will be kept at the top of the tree.
ADT: MinHeap
In order to be able to create and manipulate a binary heap as a binary tree, we're going to build a specialized BinaryHeap
class with the following features:
BinaryHeap()
constructs a new empty binary heapinsert(value)
places a new value in the binary heap in an appropriate locationfind_min()
returns the minimum value in the heap, but doesn't remove itdel_min()
returns the minimum value in the heap and removes it from the heapis_empty()
returns True or False depending on the state of the heapsize()
returns the number of elements in the heapbuild_heap(value_list)
creates a new heap from the given list of values
6.7.1. Implementing MinHeap
As you know, there are multiple ways that an Abstract Date Type can be implemented. We've used both a list-based strategy and a node-based strategy to examine Binary Trees.
Here, let's implement our MinHeap using a list.
MinHeap with a list
Using a list to maintain a MinHeap is easy if we use the following strategy:
- The first element in the list storing our heap—
heap_list[0]
—will not be used. We'll just put a value of0
in there. - All subsequent elements in the list—any element at index
n
—will have its children locate at2 * n
and2 * n + 1
. This allows us to easily identify the children of any given node. - Any element at index
n
can have its parent node easily determined asn // 2
(integer division).
Thus, the tree here would be representing by the following list:
heap_list = [0, 3, 7, 12, 11, 20, 18, 19, 17, 13]
You should confirm that the list strategy developed here works with this heap_list
.
6.7.1.1. How to Add to the Heap
One of the most interesting parts of managing a binary heap is the process of inserting a new value into the heap. Given the previous heap, for example, how would we add the value 10
to the heap and still maintain the heap property?
Rather than starting at the top and working our way downward (which would be logistically difficult), we're going to
- Place the new item at the end of our list, and
- "Percolate" the item upward, swapping it with its parent until it has arrived at an appropriate location.
6.7.1.2. How to Remove an Item from the Heap
Another challenge is the process of removing the minimum value from the minHeap. The smallest value is at the top, so taking that from the list is easy. But how do we ensure that the tree maintains an appropriate heap structure?
Here's our strategy:
- Store the item from the top of the heap so that we can return it later
- Move the last item in the list up to the top of the heap
- "Percolate" this item downward, swapping it with the smaller of its two children until it has arrived at an appropriate location
6.7.1.3. How to Create a Heap
Given a list of initial values, how can we place them into a heap structure? Here's one way that's relatively fast and efficient.
def buildHeap(self,a_list): i = len(a_list) // 2 self.current_size = len(alist) self.heap_list = [0] + alist[:] while (i > 0): self.percolate_down(i) i = i - 1
Now that we've discussed all the pieces, let's see if we can write the BinaryHeap
class.
Writing the BinaryHeap
class
Write a program binary_heap.py
that implements a MinHeap using a BinaryHeap
class.
The primary challenges with this assignment include:
- How do we add a new node to our heap so that it is located at a valid heap location?
- How do we adjust our heap so that it is still valid after we remove an item at the top of the heap?
- Given a pre-existing list of values, how do we efficiently assemble them into a heap?
Your program should include the BinaryHeap
class declaration as well as a main()
program that demonstrates the use of the class.
An optional template is available if you want something to help you get started.
6.8. Binary Search Tree
We've looked at a couple of ways of organizing information for easy retrieval later on.
- The
list
data type allows us to easily keep unsorted values in a linear structure, although finding a given item isn't very efficient. - A
list
of sorted values is easier to look through using a Binary Search strategy, but items have to be sorted to do that. - A
hash
ordictionary
allows us to store key-value pairs in memory, although a lot of memory is required to store a few values if we want to avoid having too many collisions.
Let's look at another way of storing data for quick retrieval, a strategy that uses Binary Trees.
6.8.1. Review of the Map
ADT
Recall the Map() Abstract Data Type that works like a dictionary in Python, with a unique key and an associated value.
The Map()
Abstract Data Type
Map()
creates a new, empty map objectput(key, value)
adds a new key-value pair to the map. Ifkey
already exists in the map, replace its previous value with the new value.get(key)
returns the value associated with the specifiedkey
del
deletes a key-value pair from the map. Usage:del map[key]
len()
returns the total number of key-value pairs in the mapin
returnsTrue
if a given key is in the map. Usage:key in map
Binary search trees exhibit the bst property, which is different from the heap structure we looked at before.
The BST property
A binary search tree exhibits the "bst property": for any given parent (key), all the keys less than that are located in the left subtree, and all the keys greater than the parent are located in the right subtree.
Take a look at the tree here to confirm that it exhibits the bst property.
To build a Binary Search Tree, begin at the top. The first value entered in the tree goes at the root location. The second value added to the tree is added as the left child or right child, depending on its value relative to the root. All subsequent values are placed as left or right children, as space is allowed in the binary structure.
How was the binary search tree above created?
To create the tree above, we can identify how the items must have been added into the tree when it was being built.
12
went in first, and became the root value of the tree. Perhaps 7
was added next, which is less than 12 so it became the left child. Maybe 11
was added next, less than 12 but greater than 7, so it was placed in level 2. At some point 18
was added, becoming the right child of 12 because it is greater than that root value. 13
or 19
might have been added next to become the left/right children, and the lower values came later, filling in appropriately.
6.8.2. Implementing the Binary Search Tree
Recall our Unordered List strategy, in which we had Node
objects with pointers to each other, and an UnorderedList
class that pointed to the first node on the list.
In a similar fashion here, we'll have the root
of the BinarySearchTree
class point to a TreeNode
object, which in turn points to other TreeNode
objects. You may recall that we've had to resort to a variety of strategies to be able to find the parent for any child: in one case we pushed the parent onto a Stack for later retrieval, and in the list-based heap, we could take the index of any item and do an integer division //
to get back to its parent. Here, our TreeNode
is going to contain an explicit reference to the parent.
6.8.2.1. The BinarySearchTree
class
Here's the start of the BinarySearchTree
class:
class BinarySearchTree: def __init__(self): self.root = None self.size = 0 def length(self): return self.size def __len__(self): return self.size def __iter__(self): return self.root.__iter__()
6.8.2.2. The TreeNode
class
And what does any given TreeNode
look like?
class TreeNode: """This class, as written, uses the fact that the Python value None has a Boolean value of False when used in boolean expressions. """ def __init__(self,key,val,left=None,right=None, parent=None): self.key = key self.payload = val self.left_child = left self.right_child = right self.parent = parent def has_left_child(self): return self.left_child # Returns None (False) if there is no left # child, otherwise, returns a reference to # the left child, which evaluates to True in # a boolean expression. Used in the # replaceNodeData method below. def has_right_child(self): return self.right_child def is_left_child(self): return self.parent and self.parent.left_child == self def is_right_child(self): return self.parent and self.parent.right_child == self def is_root(self): return not self.parent # Another way of writing this would be: # if self.parent == None: # return True # else: # return False def is_leaf(self): return not (self.right_child or self.left_child) def has_any_children(self): return self.right_child or self.left_child def has_both_children(self): return self.right_child and self.left_child def replace_node_data(self, key, value, lc, rc): self.key = key self.payload = value self.left_child = lc self.right_child = rc if self.has_left_child(): self.left_child.parent = self if self.has_right_child(): self.right_child.parent = self
The final important steps in setting up our Binary Search Tree include BinarySearchTree
methods for puting values into the tree and for getting values from the tree:
def put(self, key, val): """Puts a key-value pair into the binary tree at the correct location. Makes use of the recursive _put() helper function. """ if self.root: # if a root has been established... self._put(key, val, self.root) # call the recursive form of put else: self.root = TreeNode(key,val) # otherwise, place key-value here self.size = self.size + 1 # Either way, we've increment size def _put(self, key, val, current_node): """This beautiful and elegant recursive method takes the key-value pair and the currentNode, and works its way down through the tree until we find the correct location to put a TreeNode with the key-value. Note the _ in front of the method name. This is a "code" for indicating that this method is meant for internal use (by the class itself) only, and not used by any external callers. """ if key < current_node.key: if current_node.has_left_child(): self._put(key, val, current_node.left_child) else: currentNode.left_child = TreeNode(key, val, parent=current_node) else: if current_node.has_right_child(): self._put(key, val, current_node.right_child) else: current_node.right_child = TreeNode(key, val, parent=current_node) def get(self, key): """This tries to find the TreeNode associated with a given key. If we have a root value, call the recursive _get() helper function and store the resulting value in res for return. """ if self.root: res = self._get(key, self.root) if res: return res.payload else: return None else: return None def _get(self, key, current_node): """Another beautiful recursive method that digs in to find a TreeNode with the given key. Note the _ in front of the method name. This is a "code" for indicating that this method is meant for internal use (by the class itself) only, and not used by any external callers. """ if not current_node: return None elif current_node.key == key: return current_node elif key < current_node.key: return self._get(key, current_node.left_child) else: return self._get(key, current_node.right_child)