Build your own “not hotdog” deep learning model

Introductions

I’m Brendan.

I was on the data team at Etsy. I’m currently a software developer at 18F.

I also like my fair share of raunchy TV comedies.

Goals & Expectations

Image Classification

Input: image
Output: a label (e.g., cat, dog, hotdog) or probabilities across labels
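
To make "probabilities across labels" concrete, here is a minimal sketch that turns a classifier's raw scores into probabilities using softmax. The labels and scores are made up for illustration; a real model would produce the scores itself.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtract the largest score first for numerical stability.
    biggest = max(scores)
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical labels and raw model scores for a single image.
labels = ["cat", "dog", "hotdog"]
scores = [1.0, 0.5, 3.0]

probs = softmax(scores)
best_label = labels[probs.index(max(probs))]

for label, p in zip(labels, probs):
    print("{}: {:0.3f}".format(label, p))
print("prediction:", best_label)
```

The single-label output is just the label with the highest probability.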

Why it’s tricky for computers

Supervised learning

We give an algorithm a dataset that includes the right answers (a training set) and it learns a function that approximates the data.

After that, it can make predictions on new (but similar) data that it’s never seen.
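
As a rough illustration of that train-then-predict loop, here is a sketch using a nearest-centroid classifier. This is a deliberately simple stand-in, not a neural net, and the two-number "features" and labels are invented for the example.

```python
def fit(examples, labels):
    """Learn one centroid (mean feature vector) per label."""
    sums, counts = {}, {}
    for features, label in zip(examples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(centroids, features):
    """Return the label whose centroid is closest to the features."""
    def sq_dist(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Made-up training set: two features per example, plus the right answers.
x_train = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
y_train = ["hotdog", "hotdog", "not hotdog", "not hotdog"]

model = fit(x_train, y_train)

# A new example the model has never seen, but similar to the training data.
print(predict(model, [0.85, 0.75]))
```

The shape is the same for a ConvNet: fit on labeled examples, then predict on unseen ones. Only the function being learned is far more flexible.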

What’s the right tool?

Convolutional Neural Networks 💥

But first, let’s go over neural nets...

Neural networks primer


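  import numpy as np

  def unit(inputs, weights, bias):
      # activation_function: any nonlinearity (e.g. sigmoid or ReLU), defined elsewhere
      return activation_function(np.dot(inputs, weights) + bias)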
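
To see a single unit produce a number, here is a pure-Python sketch of the same computation. The sigmoid activation and the input values are assumptions picked for illustration; the slide leaves the activation function unspecified.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def unit(inputs, weights, bias, activation_function=sigmoid):
    # Weighted sum of the inputs (a dot product), plus the bias.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation_function(z)

# Made-up numbers: three inputs feeding a single unit.
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 0.5]
bias = 0.0

output = unit(inputs, weights, bias)
print(output)  # weighted sum is 0.0, and sigmoid(0.0) is 0.5
```

A layer is many of these units reading the same inputs, each with its own weights and bias.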
        

Convolutional neural networks

aka ConvNets aka CNNs

A powerful hammer for computer vision nails

Very similar to ordinary neural nets but with an architecture better suited to handle image inputs

To a computer, an image is just a bunch of numbers
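
A small sketch of what "a bunch of numbers" means in practice. The 2x2 image below is made up; array libraries store the same data as an array of shape (height, width, channels).

```python
# A made-up 2x2 RGB image: rows of pixels, each pixel is [red, green, blue]
# with channel values from 0 to 255.
image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 0]],
]

height = len(image)
width = len(image[0])
channels = len(image[0][0])
print("shape:", (height, width, channels))

# One common (not universal) preprocessing step: scale values into [0, 1].
scaled = [[[value / 255 for value in pixel] for pixel in row] for row in image]

# Flattened, the image is literally one long list of numbers.
flat = [value for row in scaled for pixel in row for value in pixel]
print(len(flat), flat[:3])
```

A 224x224 RGB image, the input size VGG16 expects, works the same way with 224 * 224 * 3 = 150,528 numbers.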

Convolutional neural networks

3 main pieces:

  • convolutional layers
  • pooling layers
  • a final fully-connected softmax layer

Convolutional layer

We take small filters and slide them over the image spatially. Different filters respond to different things in the image. Some may like edges, others may prefer yellow regions, etc.
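
Here is a minimal sketch of that sliding operation on a tiny grayscale image, with no padding and a stride of 1. The image and the edge filter are hand-picked for illustration; in a real ConvNet the filter values are learned during training. Like most deep learning libraries, this does not flip the filter, so strictly speaking it is a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the filter with the patch under it and sum the result.
            total = 0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# Made-up 4x4 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-picked filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)
```

The output is large only in the middle column, where the filter sits on the edge, and zero over the flat regions.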

Convolution math

Learned convolution features

Pooling layer
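
The slide only names the pooling layer, so this sketch assumes 2x2 max pooling with non-overlapping windows, the variant VGG16 uses. The feature map values are made up.

```python
def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled

# Made-up 4x4 feature map, e.g. the output of a convolutional layer.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

Pooling halves the width and height here while keeping the strongest response in each region.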

ConvNet, all together

Keras

A high-level neural network library, written in Python

Wraps an API similar to scikit-learn around the Theano or TensorFlow backend.

Modular and user friendly -- easy to construct new models and/or leverage pretrained ones

Pretrained CNN

VGG16 -- a 16-layer ConvNet trained on ImageNet data, ~140M parameters, runner-up in the 2014 ImageNet (ILSVRC) classification challenge


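import keras
from keras.applications.vgg16 import decode_predictions

vgg = keras.applications.VGG16(weights="imagenet", include_top=True)

# get_image: helper (not shown) assumed to return a preprocessed batch of shape (1, 224, 224, 3)
img = get_image("kitty.jpg")
predictions = vgg.predict(img)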

for p in decode_predictions(predictions)[0]:
    print("✨ {} (prob = {:0.3f})".format(p[1], p[2]))
            

✨ Egyptian_cat (prob = 0.213)
✨ kit_fox (prob = 0.213)
✨ tabby (prob = 0.138)
✨ red_fox (prob = 0.107)
✨ tiger_cat (prob = 0.098)
            

Transfer learning

In general, this refers to the process of leveraging the knowledge learned in one model for the training of another model.

More specifically, we’ll load a state-of-the-art model (VGG16), swap out its final classification layer, and use the rest of the ConvNet as a fixed feature extractor for our new dataset.

Transfer learning


from keras.layers import Dense
from keras.models import Model

# new softmax layer with number of classes in our dataset
new_classification_layer = Dense(num_classes, activation="softmax")

# connect the new layer to VGG's second-to-last layer and keep a reference to its output
out = new_classification_layer(vgg.layers[-2].output)

# create new network between VGG’s input layer and (new) output
model_new = Model(vgg.input, out)

# make all layers (except last) untrainable by freezing weights
for layer in model_new.layers[:-1]:
    layer.trainable = False

# ensure the last layer is trainable (not frozen)
model_new.layers[-1].trainable = True

# compile and fit model
model_new.compile(loss="categorical_crossentropy", optimizer="adadelta", metrics=["accuracy"])
model_new.fit(x_train, y_train, batch_size=128, epochs=50, validation_data=(x_val, y_val))
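
One detail the snippet above takes for granted: the "categorical_crossentropy" loss expects one-hot labels, so y_train and y_val need one row per example and one column per class. Keras ships keras.utils.to_categorical for this; the pure-Python sketch below only shows the idea, and the two-class label ids are hypothetical.

```python
def to_one_hot(label_ids, num_classes):
    """Turn integer class ids into one-hot rows."""
    rows = []
    for label_id in label_ids:
        row = [0.0] * num_classes
        row[label_id] = 1.0
        rows.append(row)
    return rows

# Hypothetical two-class setup: 0 = "not hotdog", 1 = "hotdog".
num_classes = 2
label_ids = [1, 0, 1]

y_example = to_one_hot(label_ids, num_classes)
print(y_example)  # [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
```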
        

Jupyter notebook »

Additional resources

Stanford’s Convolutional Neural Networks for Visual Recognition course notes

Practical Deep Learning For Coders (by fast.ai)

The Keras blog, specifically this post

Machine Learning for Artists

One day or day one...

#1 suggestion (if I may) -- get your hands dirty with a toy project and fill in the gaps in your knowledge along the way.

The secret of getting ahead is getting started.

Mark Twain

Thanks!

Code / Slides / Me
