
Convolutional neural network shape and sizing

One of the tedious parts of constructing neural networks is figuring out the right shapes and sizes given your inputs. This note summarizes the relevant arithmetic, as presented in the CS231n notes. The Wikipedia page on convolutional neural networks also has some nice arithmetic.

Convolutional layer

Size-wise, we typically aim to preserve the spatial dimensions of the input when passing it through a convolutional layer.

A convolutional layer accepts a volume of size \(W_{1} \times H_{1} \times D_{1}\); requires four hyperparameters: the number of filters \(K\), their spatial extent \(F\), the stride \(S\), and the amount of zero padding \(P\); and produces a volume of size \(W_{2} \times H_{2} \times D_{2}\), where:

\(W_{2} = (W_{1} - F + 2P)/S + 1\)

\(H_{2} = (H_{1} - F + 2P)/S + 1\)

\(D_{2} = K\)

Choosing hyperparameters: a general rule of thumb is \(F=3, S=1, P=1\). To preserve the input's spatial size (with \(S=1\)), ensure that \(P=(F-1)/2\).
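As a sanity check, here is a minimal sketch of the arithmetic in PyTorch (the 32×32 input size and channel counts are just hypothetical examples):

```python
import torch
import torch.nn as nn

# Output size along one spatial dimension: (W - F + 2P) / S + 1.
def conv_output_size(w, f, s, p):
    return (w - f + 2 * p) // s + 1

# Rule of thumb F=3, S=1, P=1 satisfies P = (F - 1) / 2, preserving spatial size.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)
x = torch.randn(1, 3, 32, 32)        # hypothetical 32x32 RGB input
print(conv(x).shape)                 # torch.Size([1, 16, 32, 32]) -- 32x32 preserved, D2 = K = 16
print(conv_output_size(32, 3, 1, 1)) # 32
```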

Pooling layer

In CNN architectures, we usually insert a few pooling layers in between the convolutional layers. These are responsible for reducing the spatial size of the representation, thereby decreasing the number of parameters and computation in the network and helping control overfitting. As the CS231n notes put it: “pool layers are in charge of downsampling the spatial dimensions of the input.”

A pooling layer accepts a volume of size \(W_{1} \times H_{1} \times D_{1}\); requires two hyperparameters: the spatial extent \(F\) and the stride \(S\); and produces a volume of size \(W_{2} \times H_{2} \times D_{2}\), where:

\(W_{2} = (W_{1} - F)/S + 1\)

\(H_{2} = (H_{1} - F)/S + 1\)

\(D_{2} = D_{1}\)

General rule of thumb: \(F=2, S=2\).

Note: there are many types of pooling layers. It is common to use max pooling (e.g., PyTorch's MaxPool2d).
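A quick sketch of the same bookkeeping for pooling, again in PyTorch with a hypothetical input (\(F=2, S=2\) halves each spatial dimension, since \((W-F)/S+1 = W/2\)):

```python
import torch
import torch.nn as nn

# F=2, S=2: (W - F) / S + 1 = W / 2, and the depth D is untouched.
pool = nn.MaxPool2d(kernel_size=2, stride=2)
x = torch.randn(1, 16, 32, 32)
print(pool(x).shape)  # torch.Size([1, 16, 16, 16]) -- spatial size halved, depth unchanged
```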

Activation functions

Activation functions connect the output of one layer to the input of the next, introducing nonlinearity into the network.

In general, there are two main places in a network where an activation function might be applied: the hidden layers and the output layer.

Machine Learning Mastery has a nice diagram for deciding:

[Figure: activation-function decision diagram from Machine Learning Mastery]
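As a rough sketch of that advice in PyTorch (assuming a multi-class classification task; the layer sizes here are made up):

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(64, 32),
    nn.ReLU(),          # ReLU is a common default for hidden layers
    nn.Linear(32, 10),  # raw logits; the output activation depends on the task
)
x = torch.randn(1, 64)
probs = torch.softmax(net(x), dim=1)  # e.g., softmax for multi-class classification
print(probs.shape, probs.sum())       # torch.Size([1, 10]); probabilities sum to 1
```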

Common architecture pattern

```
INPUT -> [[CONV -> RELU]*N -> POOL?]*M -> [FC -> RELU]*K -> FC
```
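For concreteness, here is one hypothetical instantiation of this pattern in PyTorch with \(N=2, M=2, K=1\), assuming 3×32×32 inputs and 10 output classes:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    # [CONV -> RELU]*2 -> POOL (block 1)
    nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, stride=1, padding=1), nn.ReLU(),
    nn.MaxPool2d(2, 2),  # 32x32 -> 16x16
    # [CONV -> RELU]*2 -> POOL (block 2)
    nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1), nn.ReLU(),
    nn.MaxPool2d(2, 2),  # 16x16 -> 8x8
    nn.Flatten(),
    # [FC -> RELU]*1 -> FC
    nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
    nn.Linear(128, 10),
)
print(model(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 10])
```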

Dropout

It has become increasingly common to randomly “drop out” a fraction of neurons between layers during training to reduce overfitting (a form of regularization). Such layers are called dropout layers.

A dropout layer does not change the input size.
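A minimal sketch showing the shape is preserved (PyTorch; the \(p=0.5\) drop probability is just an example):

```python
import torch
import torch.nn as nn

# During training, dropout zeroes each element with probability p and
# scales the survivors by 1/(1-p); the tensor's shape is unchanged.
drop = nn.Dropout(p=0.5)
x = torch.randn(1, 128)
print(drop(x).shape)  # torch.Size([1, 128]) -- same shape as the input
```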