Advanced Topics in Computer Science

Unit 4 - Linear Data Structures

Table of Contents

  1. Overview: Abstract Data Types, and the Stack
  2. Linear Data Structures: Queue
  3. Linear Data Structures: Deque
  4. Linear Data Structures: Linked List

4.1. Overview - The Stack class

There are lots of different types of linear data structures: collections of data where order is important. When we put an item at the beginning of a list in Python, we expect it to stay at the beginning of the list. (That wouldn't be the case with a dictionary or a set.) Other examples of linear data structures include the stack, the queue, the deque and the linked list.

These linear data structures, which we will soon define, are each an example of an abstract data type (ADT). An ADT is defined in terms of its functions in a "language-agnostic" way. A stack, for example, we'll define first, and then we'll implement the structure in Python. Note that we can also implement an ADT in Java, in JavaScript, in C++, in BASIC, LISP, etc. The implementation is specific to the language, but the abstract data type itself is independent of that.

4.1.1. The Stack

Abstract Data Structure: Stack

A stack is a linear data structure in which items are always added to and removed from the same end. A stack can be visualized as a literal stack of items placed on a table. The first item is placed on the table, then a second item placed on top of the first, then a third placed on top of the second. When the items are removed from the stack, the "top" item is removed first (the third item, or last item placed there), then the second one, and finally the first item.

Because of this behavior, the stack is sometimes referred to as a "last-in, first-out" (LIFO) structure.

There are a number of real life examples that can be considered as a stack.

4.1.2. Formal Definition of the Stack ADT

For us to be able to use a stack, we want to be able to do five things:

  1. Construct a Stack
  2. push an item onto the top of the stack
  3. pop an item from the top of the stack
  4. peek at the top of the stack
  5. identify the size of the stack (the number of items in it)

With these specifications, we'll be creating a Stack class that we can use to manipulate Stack objects.

Here's a more complete abstract specification that we'll use in writing our stack class.

4.1.3. Implementing the Stack ADT in Python

The Stack class in Python

Based on the specification above, write the Stack class in a file called atds.py. Test your class by writing a program stacktester.py that imports the Stack class, then demonstrates the creation and manipulation of stacks.

When you think the atds.py file with your Stack class is ready to be evaluated by the instructor's tester, upload it to the server.

For further information on this assignment, see Activity: The Stack Class [PDF].

Once you've implemented the Stack class in Python, it can be applied to a variety of situations.

4.1.3.1. Checking parentheses

Using parentheses to group mathematical operations into the correct order of calculation might look something like this:

(( a + 3 ) / (( c + 7 / (g + h))) / (q + 1)

As part of evaluating that expression, it will be important to know that the parentheses are balanced: that for every opening parenthesis there is a closing one, and that a parenthesis that closes match with the nearest opening one.

It is easy for us to create legal and illegal sets of parentheses, and people are pretty good at evaluating whether a set of parentheses is "well-formed" by inspection:

( ( ( ) ) )           legal
() () ()              legal
(( )(( )(( ))))       legal
)()()(                illegal
()()(                 illegal
((()()                illegal

How can we get the computer to inspect them and determine whether they form a legal set of parentheses or not?

Problem: Matching parentheses

Write a program called paren_checker that imports the atds module, creates a Stack() object, and has a boolean function is_valid(expr) that identifies whether a string of parentheses is legal or not.

Once you've got your paren_checker working, create a new version of the program called bracket_checker which checks expressions that include square brackets and curly braces as well.

4.2. Overview - The Queue Class

Once you've got the "Last In, First Out" (LIFO) mechanisms of a stack figured out, the queue is a simple variation on that. A queue is a "First In, First Out (FIFO) structure that is familiar to anyone who has ever waited in a line.

Someone shows up first, other people show up later and get in line behind the earlier people. First in, first out.

4.2.1. The Queue as Abstract Data Type

A queue is another linear data type where order matters. As mentioned above, in a queue structure, items that are added to the sequence first are processed before those that come later. We say that items are processed at the "front" of the queue, and new items, as they are added, are placed at the "back" of the queue.

Methods for a Queue class

Based on what we've already done for the Stack class, what methods do we need for an abstract data type Queue?

Show/hide methods

  • The Queue() constructor creates a new, empty, queue.
  • enqueue(item) adds an item to the rear of the queue.
  • dequeue() removes the item from the front of the queue and returns it.
  • peek() identifies the item at the front of the queue but doesn't return it. (Note: this method isn't included in our book.)
  • is_empty() returns True if the list is empty.
  • size() returns the number of items in the queue.

4.2.2. Implementing the Queue ADT in Python

As we consider the Queue class and implementing it in Python, it probably makes sense to use a list for our queue. But where should we place the "front" of the queue in our list: at index 0, or at index [-1] (the last item in the list)?

Let's say that I place the following items into my queue, in this order: "item" , 3 , 2.78 , "Richard", True.

You can get either of these implementations to work Python, but the first one, because of the way that Python works with lists, turns out to be quite a bit more efficient as your Queue increases in size.

You can confirm this by writing two implementations of the Queue and testing them!

The Queue class in Python

Based on the specification above, add the Queue class to your atds.py module. Test your class by writing a program queue_tester.py that creates queues and manipulates them.

When you think your Queue class is ready to be evaluated by the instructor's tester, upload this new atds.py file to the server to replace the one that is already there.

For further information on this assignment, see Activity: The Queue Class [PDF].

4.3. Overview - The Deque Class

The Deque class is a "double-ended Queue," and not to be confused with the word "dequeue." It has the best of both worlds. You can:

Which side should we make the front? Index 0? or Index -1? Does it matter?

4.3.1. Implementing the Deque class

The Deque class in Python

Add a Python implementation of the Deque class into your atds.py module. There is no peek method for this class, but be sure to include .is_empty() and .size() methods.

4.3.2. Testing your Deque class

Get a copy of the atds_tester.py and run it against your atds.py to make sure your classes are working as they should be.

4.4. Overview - Linked Lists

You're already familiar with Python's list data structure. It's a well-established implementation based on an Abstract Data Type called an unordered list. In this section we're going to be considering the details of this Abstract Data Type, and then implementing this ADT using a new class we'll be writing: the Node class.

Attention!

The material that we'll be developing in this next sections will be critical to what we do later on in the course, so make sure you get a good grasp of what we're doing as we go.

There's a place for "fake it till you make it," but this is not it. Make sure you understand these concepts well before moving on.

4.4.1. The Unordered List ADT

Imagine that you have a list of items. Each item on your list comes before and/or after the other items on the list; it's a sequence in that sense. It's "unordered," however, because the values themselves are not in order: not in numeric order, or alphabetic order.

If you want to be able to interact with that list, and do things with the elements of that list, you'll need a number of operations:

4.4.2. What does this unordered list look like?

The unordered list is abstract, and so it doesn't need to look like anything, but if we were to draw it, we could consider it as a series of data values connected by pointers linking them. The head of the list is indicated by a special pointer, and from there, each element in the list points to the next element in the list, until we finally come to the last element in the list.

4.4.3. The Node class

In order to track these different elements and their pointers, we're going to create a new class, the Node class.

A Node will contain two pieces of information—two instance variables:

  1. data = the information contained in the node
  2. next = the pointer to the next Node in the unordered list

This class will also require that we write four methods that will allow us to interact with those variables: two "getters" and two "setters".

4.4.4. Writing the Node class

Based on what we've identified, writing the Node class should be relatively straightforward.

Note that in this class we'll construct a new Node with an initial piece of data, but that there is no initial next that we point to. Make sure to set self.next to None explicitly.

Write the Node class

Based on the description above, write the Node class.

Test your class, either interactively, importing it into a small test program, or by writing a main function written into that same program.

4.4.5. Writing the UnorderedList class

Our List is going to consist of a series of Node objects all linked to each other via the next pointer. The only thing we need is something to point to the very first item—the first Node—and that will be handled by a reference in the UnorderedList class. Because this reference is pointing to the head of the list, we'll call it head.

Here is the start to our class:

from atds import Node

class UnorderedList():
    """Maintains an unordered list via a linked series of Nodes
    """
    
    def __init__(self):
        self.head = None

When we create a new UnorderedList object, we can visualize that initial empty list object in this way:

We can identify that there are no items in this new list because the head has a value of None.

If we want to add an item to the list, the new sketch of what that looks like is given here:

We can see that the head reference no longer has a value of None. It has a reference to a Node object, which has data, and a pointer to None, which is an indication that this node is the last in the list.

Let's see what the add(item) method for the UnorderedList class looks like.

def add(self, new_data):
    """Creates a new Node based on the data, and adds it to the 
    beginning of the list.
    """
    temp_node = Node(new_data)
    temp_node.set_next(self.head)
    self.head = temp_node

Sketch of an UnorderedList

Draw a sketch of what an unordered list called my_data would look like after the following code is executed:

my_data = UnorderedList()
my_data.add(5)
my_data.add(2)
my_data.add(17)

Show/hide sketch

Because of the order that we added the items into the list, and the fact that the .add() method inserts items at the head of the list, you can see that the first item added is last on the list.

4.4.6. Traversing the List

Many of the methods in the UnorderedList class are going to require working through the entire collection of nodes, following from one object to the next, as indicated by the next references.

Let's see how you do that as we write the length method.

def length(self):
    """Traverses the entire length of the UnorderedList to identify 
    how many values (nodes) there are in the list.
    """
    node_count = 0                  # local variable for counting
    current = self.head             # start at the beginning
    while current != None:          # if we're not at the end of the list
        current = current.get_next()     # move to the next node
        node_count += 1                 # increment node counter
    return node_count

Now that we know a little about traversing...

UnorderedList: the search() method

Write the boolean search method, which takes an data value and looks through the list to identify if it's there. If the item exists on the list, return True.

Show/hide solution

def search(self, data):
    """Traverses through the list to determine if the indicated data 
    is present.
    """
    current = self.head
    found = False                   # Assumed at the start
    while current != None and not found:
        if current.get_data() == data:
            found = True
        else:
            current = current.get_next()
    return found

4.4.7. UnorderedList Debugging Tool

Insert this method into your UnorderedList class so that you can print a representation of your list at any time. It will help in trying to identify the current state of your list.

def __repr__(self):
    """Creates a representation of the list suitable for printing,
    debugging.
    """
    result = "UnorderedList["
    next_node = self.head
    while next_node != None:
        result += str(next_node.get_data()) + ","
        next_node = next_node.get_next()
    if result[-1] == ",":
        result = result[:-1] # remove trailing comma
    result = result + "]"
    return result

4.4.8. The remove() method

Perhaps the trickiest method to write (at this point in our course) is the remove(data) method. We need to be able to traverse the linked list, identify the node that has the given data in it, and then take it out of the list.

But how do we "take it out of the list?"

We need to take the pointer from the previous Node—the one that points to our current node—and have that point to the node just after our current node, effectively removing our current node from the list.

So we'll need to be sure that we keep track of that previous node with a variable for that purpose.

Take a look at this code to see how it all works:

def remove(self,data):
    """This tricky method uses a pair of variables to monitor
    the traversal so that we can remove the desired item from the
    list by taking the pointer from the previous Node and setting
    it to the item AFTER the one that we're removing.
    """
    previous = None                         # first item in list has no previous
    current = self.head                     # start at the beginning
    while current != None:              
        if current.get_data() == data:      # we found the item to remove!
            if previous == None:            # if it's the first item on the list...
                self.head = current.get_next()       # just reset head to next node
            else:
                previous.set_next(current.get_next()) 
                                            # reset previous next to this node's next
            return                          # we're done
        else:
            previous = current              # if we haven't found it yet, set previous
            current = current.get_next()    # to current and follow to the next

4.4.9. Other UnorderedList methods

What else needs to be written?

You'll have a much better understanding of linked lists once you've worked through the details of each of these methods.

4.4.10. Additional Exercises

There are a number of modifications that would be appropriate to implement, or interesting to experiment with. Pick two of the following and modify your UnorderedList class as needed.

4.5.0. The Ordered List

Let's consider now an ordered list constructed of nodes. The ordered list is similar to an Unordered list, but all of the data values are in order.

Take a look at this description of the constructor and methods of the OrderedList class.

The OrderedList class

Most of the methods that we've already written for the UnorderedList class will work just fine here.

Which methods will need to be rewritten? How can we build an OrderedList as items are added to it?