Unit 7 - Graphs

Overview: Graphs
Graph Applications
Breadth-First Search
Depth-First Search
Generalized Depth-First Search

7.1. Overview - Graphs

We've been studying trees for a bit now, which is one way of connecting related bits of data to each other. The parent-child relationship is a pretty intuitive one, and our binary tree, in which each parent can have a left and right child, is useful for a number of problem-solving situations.

Trees are actually a specialized form of a more generic way of describing data and the connections between them: the graph.

Definition: Graph

A graph in mathematics and computer science is a collection of data that are connected to each other in some way. Each piece of data in the graph is called a node or a vertex, while the connections themselves, usually drawn as lines connecting the vertices, are called edges.

Examples:

Webpages can be thought of as nodes, with their edges consisting of the hyperlinks that connect one page to another.
People are connected by their relationships with each other.
Airports are connected by flight paths.
Chess game positions are connected by the moves of the game pieces.

Graphs can be a bit more complicated than trees, but it's worth the trouble, because graphs can be constructed to help us solve all sorts of interesting problems.

7.1.1. Vocabulary

There are a few simple terms that we'll use in discussing graphs.

Graph Vocabulary

Vertex - aka. a node. Has a unique key that identifies the vertex, and often an associated value, sometimes called a payload.
Edge - An edge connects two vertices to indicate a relationship between them. Can be one-way ("directed") or two-way. A graph with only one way connections is called a "directed graph," or digraph.
Weight - An edge-associated value that can represents something about the connection, perhaps a "cost" or a "strength."
Path - A path is a sequence of vertices connected by edges. The unweighted path length of a path is simply the number of edges, or n - 1 where n = the number of nodes in the path. The weighted path length is the sum of the weights of the edges in the path.
Cycle - A closed path that starts and ends at the same vertex. A graph with no cycles is acyclic, and a directed graph with no cycles is a directed acyclic graph (DAG).

In the example shown here, relationships between Richard, Ruth, and their friends Aaron and Karen are represented by a graph with the weight of those relationships indicated.

7.1.2. The `Graph` Abstract Data Type

We'll have a number ways of interacting with our Graph. First let's define the Abstract Data Type, then we'll take a look at a way we can implement the ADT in Python.

The `Graph` ADT

Graph() - creates a new empty graph
add_vertex(new_vertex) - adds a new Vertex object to the Graph
add_edge(from_vertex, to_vertex) - creates a new edge connecting two vertices
add_edge(from_vertex, to_vertex, weight) - creates a new edge with a weight
get_vertex(vertex_key) - returns the Vertex object identified by vertex_key.
get_vertices() - returns a list of all vertices in the Graph
in - returns True in expressions of the form vertex in graph.

7.1.3. Implementing the `Graph` ADT in Python

Although one can use a sparse array to represent the vertices and edges in a graph, we're going to use an "Adjacency List" to represent our graphs. In this strategy:

A Graph object keeps track of a list of Vertex objects and the edges that connect them.
Each Vertex object keeps track of the other vertices it's connected to, as well as the weight of those connections.

Graphs are made up of connections between vertices, so before we develop our Graph class, let's develop a Vertex class.

7.1.3.1. The `Vertex` class

Let's start with the Vertex first.

#!/usr/bin/env python3
"""
vertex.py
Describes the Vertex class
"""

class Vertex(object):
    """Describes a vertex object in terms of a "key" and a
    dictionary that indicates edges to neighboring vertices with
    a specified weight.
    """
    def __init__(self, key):
        """Constructs a vertex with a key value and an empty dictionary 
        in which we'll store other vertices to which this vertex is connected.
        """
        self.id = key
        self.connected_to = {}   # empty dictionary for neighboring vertices

    def add_neighbor(self, neighbor_vertex, weight=0):
        """Adds a reference to a neighboring Vertex object to the dictionary, 
        to which this vertex is connected by an edge. If a weight is not indicated, 
        default weight is 0.
        """
        self.connected_to[neighbor_vertex] = weight

    def __repr__(self):
        """Returns a representation of the vertex and its neighbors, suitable for 
        printing. Check out the example of 'list comprehension' here!
        """
        return str(self.id) + ' connected_to: ' + str([x.id for x in self.connected_to])

    def get_connections(self):
        """Returns the keys of the vertices we're connected to
        """
        return self.connected_to.keys()

    def get_id(self):
        """Returns the id ("key") for this vertex
        """
        return self.id

    def get_weight(self, neighbor_vertex):
        """Returns the weight of an edge connecting this vertex with another.
        """
        return self.connected_to[neighbor_vertex]

Once the Vertex class is written we can create two vertexes connected to each other:

v1 = Vertex("Pasadena")
v2 = Vertex("Santa Monica")
v1.add_neighbor(v2, 90)
v2.add_neighbor(v1, 120)

This very simple graph be a way of representing how much time it takes for me to commute from one community to the other, with the weight representing a travel time in minutes. A physical representation of the graph might be something like this:

Problem: Draw this Graph

What does the graph look like for this arrangement of vertices?

v1 = Vertex("Richard")
v2 = Vertex("Susie")
v3 = Vertex("Wayne")
v1.add_neighbor(v2)
v1.add_neighbor(v3, 4)
v2.add_neighbor(v3)
v2.add_neighbor(v1, 3)

Show/hide graph

Richard connected_to: ['Susie', 'Wayne']
Susie connected_to: ['Wayne', 'Richard']
Wayne connected_to: []

7.1.3.2. The `Graph` class

Our Vertex class has been designed so that each vertex knows who its neighbors are, but the Graph class will have methods that allow us to manage those vertices and their connections.

#!/usr/bin/env python3
"""Implements a graph of vertices"""
                        
from vertex import Vertex

class Graph(object):
    """Describes the Graph class, which is primarily a dictionary
    mapping vertex names to Vertex objects, along with a few methods
    that can be used to manipulate them.
    """
    def __init__(self):
        """Initializes an empty dictionary of Vertex objects
        """
        self.graph = {}

    def add_vertex(self, key):
        """Creates a new "key-value" dictionary entry with the string "key"
        key as the dictionary key, and the Vertex object itself as the value.
        Returns the new vertex as a result.
        """
        new_vertex = Vertex(key)
        self.graph[key] = new_vertex
        return new_vertex

    def get_vertex(self, key):
        """Looks for the key in the dictionary of Vertex objects, and
        returns the Vertex if found. Otherwise, returns None.
        """
        if key in self.graph.keys():
            return self.graph[key]
        else:
            return None

    def __contains__(self, key):
        """This 'dunder' expression is written so we can use Python's "in"
        operation: If the parameter 'key' is in the dictionary of vertices,
        the value of "key in my_graph" will be True, otherwise False.
        """
        return key in self.graph.keys()

    def add_edge(self, from_key, to_key, weight=0):
        """Adds an edge connecting two vertices (specified by key
        parameters) by modifying those vertex objects. Note that
        the weight can be specified as well, but if one isn't
        specified, the value of weight will be the default value
        of 0.
        """
        # if the from_key doesn't yet have a vertex, create it
        if from_key not in self.get_vertices():
            self.add_vertex(from_key)
        # if the to_key doesn't yet have a vertex, create it
        if to_key not in self.get_vertices():
            self.add_vertex(to_key)
        # now we can create the edge between the two
        self.get_vertex(from_key).add_neighbor(self.get_vertex(to_key), weight)

    def get_vertices(self):
        """Returns a list of the Graph's Vertex keys"""
        return self.graph.keys()

    def __iter__(self):
        """Another 'dunder' expression that allows us to iterate through
        the list of vertices.
        Example use:
        for vertex in graph:  # Python understands this now!
            print(vertex)
        """
        return iter(self.graph.values())

def main():
    g = Graph()
    g.add_edge("V0", "V1", 2)
    g.add_vertex("V2")
    g.add_edge("V0","V2",12)
    for vertex in g:
        print(vertex)

if __name__ == "__main__":
    main()

7.2. Overview - Graph Applications

Now that we have Graph and Vertex classes set up, let's see if we can use them.

Developing Graph-based Solutions

When using a Graph to solve a problem, there are usually three steps in the process:

create the graph that represents the data and their relationships
analyze/explore the graph with your goal in mind, usually by identifying a starting location on the graph and searching through the nodes
performing a final traverse of the nodes to reveal the solution

We'll see how those two processes work in a variety of contexts.

Warm-up: Create this Graph

Write a main() program that creates this graph and prints out its Python representation.

Show/hide the program

def main():
    g = Graph()             # Create an empty graph
    for i in range(6):      # Create a series of empty vertices with
        g.add_vertex(i)      # simple keys (the numbers 0-5)
    
    g.add_edge(0,1,2)        # Start adding edges connecting the vertices
    g.add_edge(0,5,12)
    g.add_edge(1,2,1)
    g.add_edge(1,3,5)
    g.add_edge(3,2,7)
    g.add_edge(2,4,2)
    g.add_edge(4,3,4)
    g.add_edge(5,4,6)
    
    # Here's one way of printing out the Graph
    # This is possible because we implemented __iter__
    for v in g:
        print(v)
    print("\n--------------\n")
    # Here's another way of printing out the graph
    print(" {0:5s} | {1:5s} | {2:7s}".format("From","To","Weight"))
    for v in g:
        for w in v.get_connections():
            print("{0:^5s} | {1:^5s} | {2:7d}".format(str(v.get_id()), str(w.get_id()), v.get_weight(w)))

if __name__ == "__main__":
    main()

7.3. Breadth-first Search (BFS)

We have the ability to create a directed graph now, which consists of edges (possibly weighted) connecting Vertex objects.

Once the graph has been constructed, there are two common algorithms for moving through the a graph:

performing a breadth-first search across the vertices, in which all of a given vertex's connections are explored before descending further down into the graph
performing a depth-first search through a series of connected vertices, exploring a chain of connected vertices to the end before coming back to the next connection

Which search strategy you employ depends upon the problem you are trying to solve.

Algorithm: Breadth-First Search

Given a graph and a starting vertex start:

Place this starting vertex start in a queue.
Find all the neighbor vertices connected by an edge to start.
Place each of these vertices into the queue for further exploration.
Continue until all connections have been explored.

Using a family tree as an analogy, a breadth-first search would use a parent (node) to look at the children (nodes), and we would examine all of the child nodes before moving on to look at the grandchildren, and we would examine all of these before continuing on to the great-grandchildren.

7.3.1. Enhancing the Vertex class

It will be necessary for us to keep track of vertices that we've already explored (perhaps by arriving at them from a previous path), so we'll keep track of the state of every node as either "unexplored", "being explored", or "fully explored". It's common to consider the nodes as having a virtual color to identify these states: perhaps using a "white" node for unexplored, "gray" for being explored, and "black" for fully explored.

It will also be helpful to keep track of distances traveled as we traverse the graph, so we'll include instance variables for tracking that as well.

Finally, as we explore a vertex it will become advantageous for us to have a "Back button" that allows us to backtrack to the preceding vertex (without having to use a stack).

Let's modify the Vertex class to accommodate these new features:

class Vertex(object):
    """Describes a vertex object in terms of a "key" and a
    dictionary that indicates edges to neighboring vertices with
    a specified weight.
    """
    
    def __init__(self, key):
        """Constructs a vertex with a key value and an empty dictionary 
        in which we'll store other vertices to which this vertex is
        connected.
        """
        self.id = key
        self.connected_to = {}   # empty dictionary for neighboring vertices
        self.color = 'white'
        self.distance = 0
        self.predecessor = None
        self.discovery_time = 0     # discovery time
        self.finish_time = 0        # finish time  
    
    def add_neighbor(self, neighbor_vertex, weight=0):
        """Adds a reference to a neighboring Vertex object to the
        dictionary, to which this vertex is connected by an edge. 
        If a weight is not indicated, default weight is 0.
        """
        self.connected_to[neighbor_vertex] = weight
    
    def set_color(self, color):
        self.color = color
    
    def get_color(self):
        return self.color
    
    def set_distance(self, distance):
        self.distance = distance
    
    def get_distance(self):
        return self.distance
    
    def set_pred(self, predecessor):
        self.predecessor = predecessor
    
    def get_pred(self):
        return self.predecessor
    
    def set_discovery(self, discovery_time):
        self.discovery_time = discovery_time
    
    def get_discovery(self):
        return self.discovery_time
    
    def set_finish(self, finish_time):
        self.finish_time = finish_time
    
    def get_finish(self):
        return self.finish_time
    
    def __repr__(self):
        """Returns a representation of the vertex and its neighbors,
        suitable for printing. Check out the example of 'list
        comprehension' here!
        """
        return 'Vertex[id=' + str(self.id) \
                + ',color=' + self.color \
                + ',dist=' + str(self.distance) \
                + ',pred=' + str(self.predecessor) \
                + ',disc=' + str(self.discovery_time) \
                + ',fin=' + str(self.finish_time) \
              + '] connected_to: ' + str([x.id for x in self.connected_to]) 
    
    def get_connections(self):
        """Returns the keys of the vertices we're connected to
        """
        return self.connected_to.keys()
    
    def get_id(self):
        """Returns the id ("key") for this vertex
        """
        return self.id
    
    def get_weight(self, neighbor_vertex):
        """Returns the weight of an edge connecting this vertex 
        with another.
        """
        return self.connected_to[neighbor_vertex]

7.3.2. The Word Ladder Game

Here, we'll use a breadth-first search (BFS) to solve the Word-Ladder problem.

A Word Ladder

Using series of words, transform the initial word "BEAR" into the word "LION", changing just one letter at a time between words.

BEAR
BOAR
BOOR
.
.
.

That's the Word Ladder game.

To solve a Word Ladder problem algorithmically, we will:

create a graph of words that are related to each other according to the rules of the game: each word will be a vertex that is connected to other words that vary from the original by a single letter
Then we can follow a path from one word to another to identify a solution

How are we going to create the graph of words, though?

7.3.2. Building Graph of Related Words

There a variety of strategies that we might use to identify words that are connected to each other.

One strategy would be to begin with a word, and then check every other word in the word list to see if it differs by one letter. For every word that matches that description, we'd need to then go through every word again to find those that differ by one letter.

Building a Graph of Three-Letter Words

Write a function build_graph(wordFile) that takes a file containing only 3-letter words, builds a graph of all the words, and then creates an edge between all the words that differ from other words by only one letter.

Show/hide code

def build_graph():
    g = Graph()
    infile = open('three-letter_words.txt', 'r')
    # create vertices
    for line in infile:
        word = line.rstrip()
        g.add_vertex(word)
    # connect vertices
    for v1 in g:
        for v2 in g:
            if off_by_exactly_one(v1.get_id(), v2.get_id()):
                g.add_edge(v1.get_id(), v2.get_id())
    
    return g

7.3.4. Breadth-First Searching

A graph connecting all the words that differ from each other by one letter might start out looking something like this:

AAHS connected_to: ['DAHS', 'FAHS', 'HAHS', 'LAHS', 'PAHS', 'RAHS', 'YAHS', 'AALS']
DAHS connected_to: ['AAHS', 'FAHS', 'HAHS', 'LAHS', 'PAHS', 'RAHS', 'YAHS', 'DABS', 'DADS', 'DAES', 'DAGS', 'DAIS', 'DAKS', 'DALS', 'DAMS', 'DANS', 'DAPS', 'DAWS', 'DAYS', 'DAHL', 'DOHS']
FAHS connected_to: ['AAHS', 'DAHS', 'HAHS', 'LAHS', 'PAHS', 'RAHS', 'YAHS', 'FAAS', 'FABS', 'FADS', 'FAGS', 'FANS', 'FARS', 'FATS', 'FAWS', 'FAYS', 'FEHS', 'FOHS']

Now how do we go about looking through that graph to find the path we want?

7.3.5. Implementing the breadth-first search

With the extended version of the Vertex class we can implement a Breadth-First Search algorithm.

def breadth_first_search(graph, start):
    """Performs a BFS on `graph` beginning at `start` vertex, creating
    a series of predecessor links between the vertices.
    """
    start.set_distance(0)
    start.set_pred(None)
    queue = []
    queue.append(start)
    while len(queue) > 0:
        current_vertex = queue.pop(0)
        for neighbor in current_vertex.get_connections():
            if neighbor.get_color() == 'white':
                neighbor.set_color('gray')
                neighbor.set_distance(current_vertex.get_distance() + 1)
                neighbor.set_pred(current_vertex)
                queue.append(neighbor)
        current_vertex.set_color('black')

You'll understand this algorithm much better if you take a moment to trace through a sample graph.

7.3.5. Traversing the Graph

Finally, once the breadth-first search has been performed, the graph will have the important vertices identified with a non-None predecessor. We can traverse the graph from any valid vertex in the graph to get back to the starting position, identifying the complete path along the way.

def traverse(final):
    current = final
    while current.get_pred() != None:
        print(current.get_id())
        current = current.get_pred()
    print(current.get_id())

Run a few experiments to see how the algorithm works. Put some print statements in the program to get a better idea of what's going on.

What happens when there isn't a path from one work to another? How could you modify the program to handle that issue?

Are their modifications that would make the program better? Is there a way to print out the ladder first-to-last rather than last-to-first?

7.4. Depth First Searching

In the previous example, we looked through the graph one level at a time, cataloging each level of vertices in the graph as we proceeded, looking for a path of vertices that would provide a solution for us.

This breadth-first search strategy is appropriate for some types of problems. Others are better solved using a depth-first search strategy, in which valid paths are pursued to the end as they are encountered.

Let's see how that works.

7.4.1. The Knight's Tour Problem

A knight in chess is able to move in its odd L-shaped pattern, 2 steps in one direction and 1 more step either to the left or right from that direction. A "tour" of the knight is one in which is manages to visit every single location in a square range of locations.

A knight on square 12 has legal moves as shown. Picture from Miller and Ranum's Problem Solving with Algorithms and Data Structures using Python.

We'll begin our solution by creating a graph of the possible moves of the knight. Then we'll perform a depth-first search of that graph to produce a solution.

7.4.2. Creating a graph of the Knight's possible moves

In our graph, each position on the board will be represented by a vertex, and each edge between those vertices will represent a move of the knight.

def knightGraph(board_dimension):
    """Sets up a graph of legal knight moves on a board of a given
    size.
    """
    g = Graph()
    for row in range(board_dimension):
        for col in range(board_dimension):
            # calculate a nodeID using row and column
            # standard strategy for converting x-y values to
            # a single value
            nodeID = row * board_dimension + col
            # get the legal moves from here and store them in 
            # a list
            next_moves = find_legal_moves(row, col, board_dimension)
            for move in next_moves:
                moveID = move[0] * board_dimension + move[1]
                g.add_edge(nodeID, moveID)
    return g

The question of how to find those legal moves is relatively easy to solved.

def find_legal_moves(row, col, dimension):
    """Returns a list of row-col locations that:
        a) follow a knight's move ("up, up, over"), and
        b) are still on the board
    """
    legal_moves = []
    # describe offsets by row, col
    offsets = ( (2, 1), (2, -1), (-2, 1), (-2, -1), 
                (1, 2), (1, -2), (-1, 2), (-1, -2))
    for offset in offsets:
        r = row + offset[0]
        c = col + offset[1]
        if is_legal(r, dimension) and is_legal(c, dimension):
            legal_moves.append((r, c))
    return legal_moves
        
def is_legal(value, dimension):
    """Checks to see if the value is on the board (between 0 inclusive
    and dimension exclusive)
    """
    return value >= 0 and value < dimension

Okay, with these sections of code written, we have a graph with all of the possible knight positions, and edges connecting each of those positions with a legal move to another position.

One interesting thing we can do is view that graph.

7.4.3. Performing the tour with a depth-first search

def dfs(depth, vertex_path, current, limit):
    current.set_color('gray')
    vertex_path.append(current)
    if depth < limit:       # we haven't gotten all the squares
        neighbor_list = list(current.get_connections())
        i = 0
        done = False
        while i < len(neighbor_list) and not done:
            if neighbor_list[i].get_color() == 'white':     # haven't been here yet
                done = dfs(depth + 1, vertex_path, neighbor_list[i], limit)
            i += 1
        if not done: # We've gone through all the possibilities and haven't finished
                     # so we're going to have to backtrack
            vertex_path.pop()
            current.set_color("white")
    else:
        done = True
    return done

And now, all that's left to do is to initialize the board and initiate the search for a solution.

def main():
    DIM = 8     # 8x8 chessboard
    g = knightGraph(DIM)
    vertex_path = []
    # Specify which vertex to begin from: 
    # 0 is in corner, 1 is in bottom row 
    # next to it, etc.
    start_vertex = g.get_vertex(0)
    start_time = time.time()
    dfs(0, vertex_path, start_vertex, DIM*DIM - 1)
    stop_time = time.time()
    print("Searching finished")
    print([e.id for e in vertex_path])
    print("Elapsed time: " + str(stop_time - start_time) + " seconds")

Recursively searching solutions on an 8x8 chessboard might take a while. If you get tired of waiting for your program to finish, you might try running a smaller chessboard (6x6?) and seeing if that will finish in a more reasonable amount of time.

It's interesting to view the search as it takes place. In the dfs function, right after the vertex current is appended to the vertex_path, included the following lines:

# Print the path to this point
os.system("clear")
print([el.get_id() for el in vertex_path])

Having to print significantly slows down the recursion process, however, so you probably don't want to leave this code in there. Comment it out after you're done investigating.

7.4.4. Improving the depth-first search with a heuristic

The graph of possible moves is static—it describes all the possible moves between vertices, and there are no improvements to make there.

But we can improve the performance of the search by programming additional consideration into the traversal. For example, which position is it easier to find a legal move from, near the middle of the board or near the edge of the board?

The edge of the board has fewer possibilities, so perhaps it makes sense to try moving through those locations first, leaving the positions at the middle of the board for later on when they'll be more likely to have a move available.

Improving the `neighbor_list`

In the depth-first search function dfs, replace this line

neighbor_list = list(current.get_connections())

with this one:

neighbor_list = list(orderByAvail(current))

The orderByAvail function looks through the connections for a given vertex, checks to see how many connections each of them has, and puts them in order from least number of connections to most, so that vertices with fewer connections are considered before ones with more connections.

def orderByAvail(vertex):
    """What's going on in here?!
    This fancy method takes a vertex, looks at its neighbors,
    and identifies which of them have the *fewest* possible
    moves from there. By choosing to move to a neighbor with
    fewer possibilities *now*, we'll leave more flexibility
    for our next vertex choices later on.

    Getting this heuristic to work requires some interesting
    coding!
    """
    neighbor_list = []       # Will store both the neighbor
                            # and how many neighbors they
                            # have, as a tuple
    for neighbor in vertex.get_connections():
        if neighbor.get_color() == 'white':   # available
            c = 0           # counter for THIS neighbor
            for w in neighbor.get_connections():
                if w.get_color() == 'white':
                    c = c + 1   
            neighbor_list.append((c,neighbor))
    # We now have a list of (count, neighbor) values
    # We want to order them by count, from low to high
    neighbor_list.sort(key=lambda x: x[0])
    # Now, send back a list of the neighbors that have
    # been ordered as needed
    return [neighbor[1] for neighbor in neighbor_list]

How does this affect our search times?

(base) rwhite@Ligne2 knights_tour % python knights_tour.py
Done!
[1, 18, 35, 52, 37, 54, 39, 22, 7, 13, 30, 47, 62, 45, 60, 43, 58, 41, 56, 50, 
33, 48, 42, 59, 44, 61, 55, 38, 28, 11, 26, 32, 49, 34, 24, 9, 3, 20, 5, 15, 21, 
6, 23, 29, 19, 4, 14, 31, 46, 63, 53, 36, 51, 57, 40, 25, 8, 2, 12, 27, 17, 0, 
10, 16]
Elapsed time: 927.4193122386932 seconds
(base) rwhite@Ligne2 knights_tour % python knights_tour_improved.py
Done!
[1, 16, 33, 48, 58, 41, 56, 50, 60, 54, 39, 22, 7, 13, 23, 6, 12, 2, 8, 18, 3, 
9, 24, 34, 40, 57, 51, 61, 55, 38, 44, 59, 49, 32, 17, 0, 10, 27, 42, 25, 19, 4, 
14, 29, 35, 52, 62, 45, 28, 43, 26, 11, 5, 20, 37, 47, 30, 15, 21, 31, 46, 63, 
53, 36]
Elapsed time: 0.00043702125549316406 seconds
(base) rwhite@Ligne2 knights_tour %

7.4.5. Rectangular boards

A related topic of interest is the knight's tour on a rectangular shaped board.

Modify your programs so that they work with a given number of rows and columns, where rows <= cols.

Only certain types of rectangular boards are able to be solved. See if you can identify which boards offer a solution and which don't.

7.5. Generalized Depth-First Search

The Knight's Tour analysis involved creating a graph of connected nodes, and then designing an algorithm that would look for a single, non-branching path through that graph to identify the steps in the tour.

There are other strategies we can use to analyze a graph, including looking at every single connection ("edge") between every single vertex (node) in the graph. This has the potential to create not just a single tree of interconnected nodes, but a "forest" of trees connecting nodes.

For our next applications (following our textbook), we're going to create a new class called DFSGraph which inherits from Graph, and includes a new instance variable and two new methods (in addition to the ones it inherits):

class DFSGraph(Graph):
    """Used to create a depth first forest. Inherits from the
    Graph class, but also adds a `time` variable used to track
    distances along the graph, as well as the two methods below.
    """
    
    def __init__(self):
        super().__init__()
        self.time = 0       # allows us to keep track of times when vertices
                            # are "discovered"
    
    def dfs(self):
        """Keeps track of time (ie. depth) across calls to dfsvisit
        for *all* nodes, not just a single node: we want to make sure
        that all nodes are considered, and that no vertices are left
        out of the forest.
        """
        for aVertex in self:             # iterate over all vertices
            aVertex.set_color('white')   # initial value of unexamined vertex
            aVertex.set_pred(-1)         # no predecessor for first vertex
        for aVertex in self:             # now start working our way through
            if aVertex.get_color() == 'white':   # a depth-first exploration
                self.dfsvisit(aVertex)           # of the vertices
    
    def dfsvisit(self, start_vertex):
        """Effectively uses a stack (by calling itself recursively) to
        explore down through the depth of the graph.
        """
        start_vertex.set_color('gray')          # Gray color indicates that this
                                                # vertex is the one being explored
        self.time += 1                          # Increment the timer
        start_vertex.set_discovery(self.time)   # Record the current time for this 
                                                # vertex's discovery
        for next_vertex in start_vertex.get_connections():  # check all connections
            if next_vertex.get_color() == 'white':          # if we've touched this
                next_vertex.set_pred(start_vertex)          # reset pred to our start
                self.dfsvisit(next_vertex)                  # continue depth search
        start_vertex.set_color('black')         # After exploring all the way down
        self.time += 1                          # Last increment
        start_vertex.set_finish(self.time)      # Stop "timing"

Recall that the Graph class, from which DFSGraph inherits, already has methods that we can use to add vertices and edges to a graph.

Once we've done that, we can call dfs() to create the depth-first relationships for the graph. The dfs() method itself calls the helper method dvsvisit() to establish those relationships.

Use the example in the next section to develop an understanding of how these relationships are developed.

7.5.1. A small graph

Before we apply the DFSGraph to an actual problem, let's see how to set up a small DFS graph.

Write a small program that

Creates a DFSGraph
Adds the vertices/edges from the graph shown here
Calls the dfs() method to create a graph of the vertices
Prints out all the vertices so what we can see how they're connected

#!/usr/bin/env python3
"""
book_example.py
"""

from atds import *

def main():
    # Set up graph
    g = DFSGraph()
    # Add all the edges (vertices are created in method)
    g.add_edge('A','B')
    g.add_edge('A','D')
    g.add_edge('B','D')
    g.add_edge('B','C')
    g.add_edge('D','E')
    g.add_edge('E','B')
    g.add_edge('E','F')
    g.add_edge('F','C')
    # Explore the graph using the depth-first search, establishing
    # vertex relationships in the process
    g.dfs()
    # Print out the resulting vertices in the graph
    for v in g:
        print(v, "\n")

if __name__ == "__main__":
    main()

7.5.2. Making Pancakes (Topological Sorting)

Follow the instructions outlined in our textbook to create a recipe for making pancakes. :)

#!/usr/bin/env python3
"""
pancakes.py
"""

from atds import Vertex, Graph, DFSGraph

def main():
    g = DFSGraph()
    
    # Describe all the edges as a large tuple of tuple-pairs
    edges = (("3/4 cup milk","1 cup mix"),
             ("1 egg", "1 cup mix"),
             ("1 Tbl Oil", "1 cup mix"),
             ("1 cup mix", "heat syrup"),
             ("heat syrup", "eat"),
             ("heat griddle", "pour 1/4 cup"),
             ("1 cup mix", "pour 1/4 cup"),
             ("pour 1/4 cup", "turn when bubbly"),
             ("turn when bubbly", "eat"))
    
    # Put all the edges in the graph
    for edge in edges:
        g.add_edge(edge[0],edge[1])
    
    # Create the forest
    g.dfs()
    
    # Put all the vertices in a list:
    vertex_list = [g.get_vertex(k) for k in g.getVertices()]
    
    # Sort the vertices by finish time to create the recipe list
    vertex_list.sort(key=lambda x: x.getFinish())
    vertex_list.reverse()
    
    # Print recipe steps
    print("Here's how to make pancakes")
    for step in vertex_list:
        print(step.get_id())

    
if __name__ == "__main__":
    main()

Advanced Topics in Computer Science

Unit 7 - Graphs

Table of Contents

7.1. Overview - Graphs

Definition: Graph

7.1.1. Vocabulary

Graph Vocabulary

7.1.2. The `Graph` Abstract Data Type

The `Graph` ADT

7.1.3. Implementing the `Graph` ADT in Python

7.1.3.1. The `Vertex` class

Problem: Draw this Graph

7.1.3.2. The `Graph` class

7.2. Overview - Graph Applications

Developing Graph-based Solutions

Warm-up: Create this Graph

7.3. Breadth-first Search (BFS)

Algorithm: Breadth-First Search

7.3.1. Enhancing the Vertex class

7.3.2. The Word Ladder Game

A Word Ladder

7.3.2. Building Graph of Related Words

Building a Graph of Three-Letter Words

7.3.4. Breadth-First Searching

7.3.5. Implementing the breadth-first search

7.3.5. Traversing the Graph

7.4. Depth First Searching

7.4.1. The Knight's Tour Problem

7.4.2. Creating a graph of the Knight's possible moves

7.4.3. Performing the tour with a depth-first search

7.4.4. Improving the depth-first search with a heuristic

Improving the `neighbor_list`

7.4.5. Rectangular boards

7.5. Generalized Depth-First Search

7.5.1. A small graph

7.5.2. Making Pancakes (Topological Sorting)

Unit 7 - Graphs

Table of Contents

7.1. Overview - Graphs

Definition: Graph

7.1.1. Vocabulary

Graph Vocabulary

7.1.2. The Graph Abstract Data Type

The Graph ADT

7.1.3. Implementing the Graph ADT in Python

7.1.3.1. The Vertex class

Problem: Draw this Graph

7.1.3.2. The Graph class

7.2. Overview - Graph Applications

Developing Graph-based Solutions

Warm-up: Create this Graph

7.3. Breadth-first Search (BFS)

Algorithm: Breadth-First Search

7.3.1. Enhancing the Vertex class

7.3.2. The Word Ladder Game

A Word Ladder

7.3.2. Building Graph of Related Words

Building a Graph of Three-Letter Words

7.3.4. Breadth-First Searching

7.3.5. Implementing the breadth-first search

7.3.5. Traversing the Graph

7.4. Depth First Searching

7.4.1. The Knight's Tour Problem

7.4.2. Creating a graph of the Knight's possible moves

7.4.3. Performing the tour with a depth-first search

7.4.4. Improving the depth-first search with a heuristic

Improving the neighbor_list

7.4.5. Rectangular boards

7.5. Generalized Depth-First Search

7.5.1. A small graph

7.5.2. Making Pancakes (Topological Sorting)

7.1.2. The `Graph` Abstract Data Type

The `Graph` ADT

7.1.3. Implementing the `Graph` ADT in Python

7.1.3.1. The `Vertex` class

7.1.3.2. The `Graph` class

Improving the `neighbor_list`