Unit 11: Sorting
Topics covered in this unit
After completing this unit you will:
- be able to explain the value of having items arranged in some order ("sorted")
- be able to describe the Selection Sort algorithm and write a Java program to implement it
- be able to describe the time-performance of simple algorithms, and classify them according to Big-O characteristics
- be able to identify the Big-O characteristic of an algorithm based on a graph of its time-versus-size performance
- be able to describe and implement the Insertion Sort algorithm
- be able to describe and implement the Merge Sort algorithm
11.0. Overview
Sorting a list of items and searching for an item in a list are classic computer science problems. Most high-level languages have sorting/searching packages available for you to import, but it's still important for us to learn about these algorithms. The programming skills you develop in writing these algorithms, and the advantages and weaknesses of different algorithms, are critical computer science skills. Some software engineering job interviews even rely on an applicant being able to implement something like these algorithms, so... this is the real deal.
Someday, you may look back on this moment and say, "Yup. That's when I really started doing computer science."
Let's get started.
- Intro to Sorting: Selection Sort
- Algorithm Efficiency: Big-O Notation
- Other Sorting Algorithms: Insertion Sort, Merge Sort
11.1.0. An Intro to Sorting - SelectionSort
We're going to start our study of sorting with one of the easiest to understand sorts that there is, the Selection Sort.
The Selection Sort is just one of many algorithmic strategies for sorting a list of items. The general strategy is:
- Begin with an array of unsorted values.
- Go through the entire list of items starting at the first item (index 0) to find the position of the smallest item in the array.
- Exchange the smallest item from its position and the item at index 0. At this point, you know that the smallest item is at the start of the list.
- Move to the next item on the list (index 1), and repeat the process of looking for the location of the smallest item on the remaining list.
- Exchange the smallest item from its position and the item at index 1. Now the first two items on the whole list are in the correct order.
- Repeat this process with the third value (at index 2), and so on until the entire list has been examined and sorted.
Let's see how that process works with a list of actual numbers.
Selection Sort Example
Consider this list of numbers:
List to sort: 11 9 17 5 12
Take a moment to perform a Selection Sort by hand on this list of numbers. Go through the list, write down important values, and write a new list underneath the old one each time values in the list are changed.
Then watch this video of the Selection Sort being performed, to verify your process.
In just a moment we'll write a SelectionSortDemo class that will implement this algorithm for us. First, though, we need an array of values to be sorted.
11.1.1. Creating an array of values to be sorted
When you are first debugging your sorting algorithm you'll probably want to work with a consistent, known, array of values.
This is a nice, small, mixed up collection of values that will work for lots of different tests.
When your algorithm has been debugged, you might like to send it a larger array of random values so that we can test other aspects of its performance. For those cases, we're going to need a steady supply of arrays that we can work with, so we might as well take a few moments to create a utility that will allow us to do that automatically. Just as the Math class allows us to use a variety of static methods as we need them, let's write a short static int[] randomArray method that we can use to generate arrays filled with random numbers.
Write the randomArray method
Take a moment to write the randomArray method as outlined below.
11.1.2. The SelectionSortDemo class
In creating the class that's going to implement our Selection Sort, think about what kinds of operations will need to be performed. What was required of us when we went through the small Selection Sort example above?
We needed to:
- traverse through the list, starting at a different beginning point each time
- identify the position (index) of the smallest value in the unsorted tail end of the list
- swap the values between the beginning of the tail and the position of the smallest value
Write SelectionSortDemo.java
Write a program SelectionSortDemo.java that demonstrates sorting with a selection sort algorithm.
11.2. Algorithm Efficiency: An intro to O-notation
We want to give some thought to how efficient our algorithms are. We can do that in a couple of ways: by analyzing the algorithm's operation, and by measuring the algorithm's performance. Let's start out by taking some measurements.
11.2.1. Timing the Selection Sort Algorithm
To be able to time our computer performing a series of operations, we're going to need to keep track of how much time elapses while those operations take place. To do this, we'll be using time information accessible to us via the System class.
You may have used a stopwatch on your phone, or to time track and field events. The buttons on a typical real-life stopwatch typically include:
- a Start/Stop button
- a Reset button
- a "Lap" button that displays the current time, even as the stopwatch continues running in the background.
We'll use the Java call System.currentTimeMillis() to get a value for the current time, to the nearest millisecond (approximately). (If we were measuring events that take even less time we could get down to the nanosecond, although there are issues associated with using a time that precise.)
What methods do you think you might need to create this StopwatchUtility class?
The StopwatchUtility class
Write a StopwatchUtility class that includes a startStop method, a getTimeElapsed method, and a reset method. Think about what each of those behaviors means for the time that we're recording.
Let's run this program as we sort arrays of various sizes and see what the results looks like.
11.2.2. Measuring the Performance of the Selection Sort Algorithm
Profiling Selection Sort
Take a few moments to collect some Selection Sort data using arrays of various sizes. Record both the size of the array and the number of milliseconds it took to sort the array.
As you're doing this, use a Stopwatch object in the SelectionSortRunner to record how much time passes only for the sorting process. We don't want to time how long it takes to display values, get input, or anything else that's not directly related to our sorting algorithm.
Once you've collected a range of data, use graphing software (Google Sheets?) to display the results, plotting "time to sort (s)" versus "array size (number of values sorted)".
Perform an analytic regression on the data. Does your data look linear? logarithmic? exponential? quadratic? some other power relationship?
11.2.3. Overview - O-Notation
We've looked at the Selection Sort and perhaps collected some data in regards to how long it takes to perform sorts of a certain size. In this lesson we'll graph that data, do a more theoretical analysis of what's going on, and learn to express that analysis in O-notation form.
Then we'll move into implementing an Insertion sort, and see what kind of performance that sorting operation yields for us.
11.2.4. Graphing the Selection Sort Algorithm
I collected some sorting time data for arrays of different sizes using a Selection Sort algorithm, and here's what the data looked like.
Obviously the time that it took to sort values increased as the number of values to be sorted got bigger, but how did that time increase? Did it increase linearly, where a doubling in the number of values would cause a corresponding doubling of the sort time?
One of the interesting things about this graph is that it follows a very precise pattern, a pattern that doesn't depend on how fast my computer's processor is or whether I'm running it on a Windows, OS X, or Linux machine. The shape of this graph is a function of the algorithm that we're using.
We're going to continue to collect timing data on our algorithms, but first, let's develop a more abstract analysis approach.
11.2.5. Analyzing the Efficiency of the Selection Sort Algorithm
Now that we've already seen what the efficiency of this algorithm looks like, lets think about a more theoretical way of considering its efficiency.
Here, we're not worried about clock time anymore. That varies depending on the machine, the language, the CPU, the program... We're going to base our theoretical analysis on the instructions performed, and the most intensive instruction in this program is accessing an array element.
So, how many times do we access an array element in our sorting?
Number of visits in Selection Sort
How many times do we access an array element in Selection Sort?
- The size of the array = n, and going through the array to find the smallest element = n accesses.
- Swapping the two elements—[smallestLoc] and [i]—costs us 2 accesses.
- Now we have one less element to visit in the whole array = n-1 accesses.
- Swap two elements = 2 accesses...
So....
For the first part:
Multiply out to get n2/2 + 5n/2 - 3 = n2 equation (other terms pale in comparison).
So this explains why our equation of data collected looks parabolic (quadratic) in shape.)
11.2.6. Big-O notation
So it looks like the biggest factor in our search times is based on n2/2. So if we start with 1000 items in the array and double it to 2000 items, how does the search space increase?
Increase the size of the array by two, and the time to run the algorithm increases by a factor of four. The number of cells access increases with the square of the size of the array. We say the "The number of visits is of the order n-squared."
Big-O notation describes this relationship as O(n2).
Big-O Notation
Big-O notation is a system of describing the efficiency of an algorithm as its workspace increases in size.
f(n) = O(g(n)) states that the function f (our memory accesses in this case), increases no faster than the function g (the size of our workspace n, squared, in this case).
We've ignored some of the finer details in arriving at this value, but we're looking for a general analysis of the algorithm at this point, so we typically only examine the fastest-growing term in our analysis and ignore all the others.
11.3.0. Other Sorting Algorithms
There are two other sorting algorithms that we want to have some experience with.
11.3.1. Insertion Sort
Insertion Sort
Implement a class InsertionSortDemo, which demonstrates the use of the Insertion Sort algorithm:
- Begin with an array of n values a[0], a[1], a[2], ... a[n - 1]
- We'll consider the elements to the right to be unsorted (the "tail" of the array), and a growing number of elements to the left to be sorted
- The first element i = 0 in the array can be considered to be sorted: it's the first element on the left, and all by itself can be sorted.
- Proceed to i + 1, and look at a[i + 1]. Take that value and work backwards toward the left end of the array, and insert it into the correct position in the sorted "head" of the array. (Ie., does it go before a[0] or after?) As you go, move all the other elements of the array to the right to make room for this number.
- Get the next number in the array, examine it, and work backwards again to insert it into its correct position in the sorted array, being sure to shift all the other elements down again.
- Proceed until the entire array of elements has been inserted into their correct position in the array.
One of the interesting challenges in the InsertionSort is how to find where the element to be inserted should go, and how to move all the elements in order to make room for the insertion.
Once you've completed this algorithm, use the Stopwatch class to collect some time data on it with different size arrays. What kind of Big O-value does this algorithm have?
Insertion Sort Code
Show/hide InsertionSort statistics
Insertion Sort Statistics
You can see from the data here that the Insertion Sort also works in O(n2) time. This algorithm has the same efficiency that the Selection Sort algorithm has.
11.3.2. A review of Big-O Notation
Recall that we've already looked at Big-O notation, which describes how the time it takes to complete an operation varies with the size of the workspace.
An O(n2) Selection Sort that takes 10 seconds to sort 1000 items will take a greater amount of time to sort 3000 items. How much more time? 3000 items is 3 times as much to sort, so the Selection Sort will take 3-squared, or nine times the original 10 seconds: 90 seconds.
Take a look at this data collected on three different sorting implementations, one Selection Sort, and two Insertion Sorts.
Trick Question
Question: Which one of the sorting algorithms shown in the graph above is the least efficient in terms of its O() value?
Answer: Look at the regressions on those curves and you will see that, to the nearest approximation, they are all O(n2); they have the same efficiency. Each curve expresses the fact the time to sort an set of values increases as the square of the size of those values.
Keep in mind that O() notation is not about how much absolute time it takes a process to happen; it's about how the time to do something increases when the workspace changes.
11.3.3. Merge Sort
The Merge Sort is an ingenious little algorithm that works on a very simple principle: sorting two piles of stuff is easier than sorting one big pile.
MergeSort example
Take a look at this example of using a Merge Sort.
Although Merge Sorts can be set up using a loop, there's a logic and an appeal it using a recursive solution to this sorting problem.
Here's a pseudocode recursive MergeSort:
Pseudocode: MergeSort (using recursion)
When you write a solution this way, sometimes it almost feels like cheating. The only hard part is writing a merge() method that will combine the results of the two arrays, but we should be able to handle that. Let's see.
11.3.4. Implementing Merge Sort
MergeSort
Beginning with this template, implement a MergeSort.
You can use your same ArrayUtility class to set up a random array when you're ready to do serious testing, but you might want to begin your testing with a small array that you keep consistent over multiple tests. That will help you to diagnose any issues that might arise in your algorithm.
You can also use the same Runner class that you used for the other sorters—just modify it so that it is creating a MergeSorter class.
11.3.5. How Efficient is the Merge Sort?
We can certainly use a Timer to check out how speedy our MergeSort is.
Merge Sort is O(n log n)
What does it mean to say that the MergeSort is an O(n log n) algorithm?
11.3.6. Sorting using Java's Arrays utility
The Arrays utility has its own sorting routine which uses the Quicksort algorithm.
While implementing a sorting algorithm is a perfectly appropriate exercise for computer science students, in almost any actual coding context you would make use of the libraries that are available to you. Feel free to use Arrays.sort() in your work, unless specifically asked to implement a sorting algorithm.
11.3.7. The Sounds of Sorting
I could watch this six-minute audiovisual representation of various sorting algorithms for days.
Fascinating!