Unit 11: Searching
Topics covered in this unit
By the time you are done with this unit you will:
- be able to describe and implement a linear search on an Array or ArrayList of values.
- be able to describe and implement a binary search on an Array or ArrayList of values.
- be able to consider a binary search in terms of a recursive solution.
- be able to explain the Big-O performance of the linear search and binary search algorithms.
11.0. Overview - Searching
Sorting algorithms are a classic topic in computer science... and so are searching algorithms. We're only going to examine two: the Linear Search and the Binary Search.
Let's get started!
- Searching
- Linear Search
- Binary Search
- Implementing non-recursive Binary Search
- Recursive Binary Search
11.1. Searching
Information is often stored in some sort of data structure: an Array, an ArrayList, a Collection, a text-based "flat-file," a relational database, a hash table, a linked list...
The greater the quantity of information that needs to be looked through, the more important it is to implement a searching mechanism that can easily identify the specific information you want to access.
For this class, we're only going to look at two searching strategies: the linear search and the binary search.
11.2. Linear Search
A linear search consists of looking through the list of items one at a time until we've either
- found the item that we were looking for, or
- gone through the entire list and discovered that it's not on our list
LinearSearch
Write a class LinearSearch that takes an integer array as a parameter when the LinearSearch object is constructed. The class should include an int method .find() that looks for a value n, and returns the first location of that value in the array.
Then write a LinearSearchDemo that uses ArrayUtility to create an random array of values, asks the user to enter a search value, and then prints the result of the search.
Implementation question: What should we return if the value is not found on our list?
11.3. Binary Search
A linear search is just as good as any other search strategy if your data hasn't been organized or indexed in some fashion. But what if your data is organized? What if we have an array of values that have already been sorted, and we want to look up an entry to find its position in the array?
A number between 1 and 10
"I'm thinking of a number between 1 and 10. Guess what the number is and I'll tell you to guess 'higher' or 'lower' until you guess the number."
How would you attempt to minimize the number of guesses you have to make?
If we're trying to guess a number between 1 and 10, it makes sense to start in the middle. That way if we don't guess the correct result the first time, we'll get a hint about which values we can eliminate from our future guessing—we'll remove half the numbers every time we guess!
Follow-up question: If I had to guess a number between 1 and 4, how many guesses would I need to have to be sure of getting the right number? and if I wanted to keep the guessing game interesting for someone else—giving them just enough guesses so that they might not get it each game—how many guess should I give them? What if someone has to guess a number between 1 and 8? How many guesses to guarantee they can get the answer? How many guesses to keep it interesting?
For a number between 1 and 4, three guesses are needed to guarantee, with two guesses to "keep it interesting." For a number between 1 and 8, four guesses are needed to guarantee, three guesses to keep it interesting.
By the way: log2 4 = 2, and log2 8 = 3. There are some interesting patterns here!
... and that's exactly how a Binary Search works:
Binary Search
A binary search locates a value in a sorted array by locating the element in the middle of the array and checking to see if that is the value being searched for. If so, we report the index and we're done. Otherwise, if the value at the middle is too high, we focus on looking through the lower half of the array, and if the value at the middle is too high, we focus on looking through the upper half of the array. Take this new subset of the original array as a new array that can be searched using the same strategy: look at the middle element, and if needed refocus on either the lower or upper subsets of the array. This process repeats—look at a segment of the array, and split it in two to perform a sub-search—until the position of the value is identified, or it has been determined that the value is not in the array.
Performing a binary search is so easy to understand that we play it as the "Guess a number between 1 and 10" game.
Let's see how we can code a binary search.
11.4. Implementing Binary Search - non-recursive
Write a class BinarySearch.java
- The BinarySearch.java class takes a sorted array of int values in its constructor. The int method .search(int n) looks for the value n in the array by performing a binary search. If the value of n is found, the index is returned. If n is not in the list, a value of -1 is returned.
- Although it won't be necessary for the final version of your program, it may be helpful during debugging to print out the values of various instance variables as you look through the array. Feel free to use numerous println() statements to assist you.
- Start your debugging with a 5-element array:
int[] a = {2, 4, 6, 7, 9}
Test your search algorithm on it trying to find each of these elements:
6 --> Expected result: 2 4 --> Expected result: 1 8 --> Expected result: -1 0 --> Expected result: -1 10 --> Expected result: -1 9 --> Expected result: 4 - Then set up a 4-element array and see if it still works:
int[] a = {2, 5, 7, 9} 5 --> Expected result: 1 2 --> Expected result: 0 9 --> Expected result: 3 10 --> Expected result: -1 4 --> Expected result: -1 1 --> Expected result: -1
If you're a little unsure of where to start with the Search algorithm, check out this pseudocode:
11.5. Recursive Binary Search
A recursive strategy is a natural for looking for a value in an array.
Recursive Binary Searching
The Binary Search is an ideal context for trying recursion: we want to look for a value in a range of values, and if we don't find it in the middle of that range, we readjust our scope, and look for a value in a smaller range of values, and if it's not in the middle, we readjust our scope, and... It's recursion! We just need to make sure we stop if we haven't found the result by the time the scope of our search space has been eliminated.
Here's some pseudocode for a static search method. See if you can get it to work.