Not Hotdog App with fast.ai

A simple way to build a powerful deep learning image classifier

Posted by Michael G. on August 14, 2019


In this post we will use fast.ai to recreate the "Not Hotdog" app featured in the TV show "Silicon Valley". The fastai library makes it simple to train fast and accurate neural nets using modern best practices.

The code for this project can be found here.

Imports


fastai handles all of the dependencies for you: you simply call from fastai.vision import * and that is it. I also imported a library called google_images_download, which I will use next to download the images from Google that will be used to train our model.
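The import cell, lost with the screenshot, was likely just these two lines (assuming both packages are pip-installed):

```python
# fastai's star import pulls in everything needed for vision tasks:
# ImageDataBunch, cnn_learner, models, transforms, open_image, etc.
from fastai.vision import *

# Third-party scraper used below to fetch training images from Google Images
from google_images_download import google_images_download
```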


Data Preparation


For our Not Hotdog app to work, we need to train it with images of "hotdogs" and "not hotdogs". To get the images we use the library google_images_download, which reads its parameters from a JSON config file. We are essentially telling it to query Google Images once for "hotdog" and download the first 100 images into a folder called "images/hotdog", and then again for "random pictures", storing those in a folder called "images/not_hotdog".
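The config file (shown in the lost screenshot) would follow the library's documented format, roughly like this; the exact keys come from the google_images_download docs:

```json
{
  "Records": [
    {
      "keywords": "hotdog",
      "limit": 100,
      "output_directory": "images",
      "image_directory": "hotdog",
      "format": "jpg"
    },
    {
      "keywords": "random pictures",
      "limit": 100,
      "output_directory": "images",
      "image_directory": "not_hotdog",
      "format": "jpg"
    }
  ]
}
```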

We execute the download by calling !googleimagesdownload -cf google_config.json. We now have the images we need for our deep learning model.


fastai uses a Learner object to train a model. In order to create a learner with your data, you first need to create an ImageDataBunch object.

Above we:
  • Set the path for the location of the image folders
  • Provide the names of the two folders we are labeling
  • Create a validation set using 20% of the images
  • Transform the images
  • Set the size of the images
  • Normalize the data
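The steps above map onto a single ImageDataBunch call. This is a sketch under the assumption that the images live under an images/ folder with one subfolder per class:

```python
path = Path('images')

np.random.seed(42)  # make the 20% validation split reproducible

# Label images by their parent folder name ("hotdog" / "not_hotdog"),
# hold out 20% for validation, apply the default augmentations,
# resize to 224x224, and normalize with ImageNet statistics
data = ImageDataBunch.from_folder(
    path,
    train='.',
    valid_pct=0.2,
    ds_tfms=get_transforms(),
    size=224,
).normalize(imagenet_stats)
```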


Now that we have created our databunch, data, we can preview the transformed / normalized images. We can also access the labels we are trying to predict, as well as the sizes of the train and validation sets.
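The inspection cells probably looked something like this (variable name data as above):

```python
data.show_batch(rows=3, figsize=(7, 7))  # preview transformed/normalized images

print(data.classes)                      # the labels, e.g. ['hotdog', 'not_hotdog']
print(len(data.train_ds), len(data.valid_ds))  # train / validation sizes
```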


Training


To train our model we create a cnn_learner, which specializes in image classification. We pass in the databunch data we created earlier and otherwise use the default arguments found in the documentation.

We then fit the model using learn.fit_one_cycle(4). As you can see, with each pass over the dataset (epoch) the model's performance gets slightly better.

Finally, we save the current iteration of the model in case something goes wrong and we need to revert to a previous state.
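Put together, the training cell would be a few lines; the resnet34 backbone is an assumption here (it is the usual choice in the fastai docs, but the screenshot may have used a different one):

```python
# Pretrained ResNet-34 backbone; error_rate is reported after each epoch
learn = cnn_learner(data, models.resnet34, metrics=error_rate)

# Train for 4 epochs with the one-cycle policy
learn.fit_one_cycle(4)

# Checkpoint so we can revert with learn.load('stage-1') if needed
learn.save('stage-1')
```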


The next step is to optimize our model using the learning rate finder. The learning rate tells our model how fast or how slow to minimize the loss. Too high a learning rate may overshoot and never find the minimum loss; too low and training could take far too long.

We call learn.lr_find() and then learn.recorder.plot(). According to the documentation, we are supposed to pick a range on the plot where the loss has the steepest downward slope. I'm still a little unsure about this, but I opt for the range 1e-05 to 1e-04.

We then fit the model one last time using the new learning rate range.
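This fine-tuning stage can be sketched as follows; the unfreeze() call and the 2-epoch count are assumptions based on the standard fastai workflow, not visible in the original post:

```python
# Plot loss vs. learning rate to pick a range with the steepest descent
learn.lr_find()
learn.recorder.plot()

# Unfreeze the pretrained layers and fine-tune across the chosen range
learn.unfreeze()
learn.fit_one_cycle(2, max_lr=slice(1e-5, 1e-4))
learn.save('stage-2')
```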


Evaluation


Now that our model is sufficiently trained, let's check the results on the validation set with a confusion matrix. As you can see below, we have 2 wrong predictions, where our model thought the picture was a "hotdog" when in fact it was "not_hotdog".


By calling interp.plot_top_losses(2) we can see the two images that were incorrectly labeled. As you can see both images somewhat resemble the shape of a hotdog.
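The evaluation cells above most likely used fastai's ClassificationInterpretation helper, roughly:

```python
# Collect validation-set predictions and losses for analysis
interp = ClassificationInterpretation.from_learner(learn)

interp.plot_confusion_matrix()             # shows the 2 misclassified images
interp.plot_top_losses(2, figsize=(7, 7))  # view the two worst predictions
```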


The last step in the process is using our model to predict new images. I imported 3 new images using open_image. We then call learn.predict on each image, which returns the predicted class, its index, and the probabilities. We are only interested in the class, so we use an underscore "_" for the other two.
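For a single image, that prediction step looks like this (the file path is hypothetical; use wherever your test images live):

```python
# Hypothetical path to a new, unseen image
img = open_image('test_images/hotdog.jpg')

# predict returns (class, class index, probability tensor);
# keep only the class and discard the rest
pred_class, _, _ = learn.predict(img)
print(pred_class)
```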

As you can see, it accurately identifies the hotdog as a "hotdog" and the dog as a "not_hotdog". Unfortunately, the hamburger is labeled incorrectly, though it does somewhat resemble a hotdog. The next step would be to keep iterating on the model by adding more images and/or tuning parameters.


Summary

fastai makes deep learning very accessible for anyone who wants to become a practitioner. It lets users create models so accurate that only a few years ago they would have been considered state of the art. The library is very well documented and has plenty of examples to follow along with. I highly recommend you check it out.