Using a Neural Network - Bird or Not?
I. Assembling some tools
Using a web-based application to classify images of fish is fine for demonstrating the principles of classification, but... what kind of software do scientists and researchers actually use to analyze data?
They use software libraries, usually based on Python, with names like TensorFlow, PyTorch, and scikit-learn. For our introduction here, we're going to follow a course put out by fast.ai that's built on the PyTorch library. As an instructional interface, we'll be running code in a Jupyter Notebook hosted at kaggle.com.
We'll follow along with some of the instructions in the first lesson.
Let's get started!
II. Do this
- Look at fast.ai's Lesson 1.
There's some great introductory text here, and a long introductory YouTube video that will help orient you if you get serious about taking this course.
- Go to kaggle.com.
You'll want to create a free account (use your Poly email for the account), and go to the Settings page to verify your account by entering your phone number. This is required for you to be able to access the GPU.
- Launch the Kaggle Jupyter notebook for Lesson 1.
Note that this notebook can't be edited or even run by you until you click on the "Edit My Copy" button in the upper right of the screen. This will make a copy of the project in your Kaggle account (like copying a Google Doc that you don't have edit permission for) so that you can run it and edit it as you wish.
- Proceed through the notebook one step at a time, reading the text cells and running the code cells in order.
You can execute a code cell by clicking on the numbered [] next to that block of code to activate it, and then clicking on the "Play" triangle that appears. Some cells will produce output, some won't.
Mods to get your notebook to run!
You'll need to make some modifications for your notebook to run correctly.
- Under "Notebook Options," enable Internet access.
- Unpin the environment version, and select the latest version.
- Modify cell 4 to be the following:

    # from duckduckgo_search import ddg_images
    from duckduckgo_search import DDGS
    from fastcore.all import *

    ddgs = DDGS()

    def search_images(term, max_images=30):
        print(f"Searching for '{term}'")
        # return L(ddg_images(term, max_results=max_images)).itemgot('image')
        return L(ddgs.images(keywords=term, max_results=max_images)).itemgot('image')
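If you're curious what the last line of search_images in cell 4 is doing, here is a minimal sketch in plain Python that runs without Internet access. It assumes each search result is a dictionary with an 'image' entry holding the picture's URL, which is what .itemgot('image') relies on. The sample results and the helper name image_urls are made up for illustration; they are not real search output or part of the notebook.

```python
# Made-up stand-ins for search results (hypothetical URLs, not real search output).
sample_results = [
    {"title": "Robin on a branch", "image": "https://example.com/robin.jpg"},
    {"title": "Blue jay in flight", "image": "https://example.com/bluejay.jpg"},
]

def image_urls(results):
    """Rough plain-Python equivalent of L(results).itemgot('image')."""
    # Keep only the 'image' entry (the picture's URL) from each result.
    return [result["image"] for result in results]

urls = image_urls(sample_results)
print(urls)
```

In the notebook itself, the list of URLs that search_images returns is what gets handed to the download steps in the later cells.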
By the time you've been through the notebook, you won't understand everything that has happened, and you certainly won't understand every command in the notebook. But you'll have seen some of the mechanics of it all, and perhaps you'll even have had a chance to try classifying something other than a bird.
Homework
Modify your Kaggle notebook to see if you can get the model to classify other types of items. Coins? Animals? Conifers versus deciduous trees?
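One way to get started is to plan your categories before touching the notebook. The sketch below assumes your copy of the notebook saves the images for each category into its own folder and searches for each category by name. The helper plan_searches, the folder name coin_or_not, and the "photo" search phrase are all hypothetical choices for illustration. The code only plans; it doesn't download anything.

```python
from pathlib import Path

def plan_searches(categories, root="coin_or_not"):
    """Map each category to a destination folder and a search phrase.

    Hypothetical planning helper; it does not search or download anything.
    """
    root = Path(root)
    return {
        category: (root / category, f"{category} photo")
        for category in categories
    }

plan = plan_searches(["penny", "quarter"])
for category, (folder, query) in plan.items():
    print(f"{category}: save into {folder}, search for '{query}'")
```

Each search phrase could then be passed to search_images, with the results saved into the matching folder, in place of the bird and forest searches in your copy of the notebook.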