- Applications
- Images
- Image classification
- Image semantic segmentation
- Image retrieval
- Object detection
- Language
- Text classification
- Software mining
- Software flaw detection
- History
- Receptive field in a single neuron, 1959
- Neocognitron, about 1980
- Gradient-based CNN for hand-written character recognition, 1998
- AlexNet in ImageNet competition, 2012
- Hierarchical structure
- Physical components
- Layers of neurons
- Learnable weights
- Learnable biases
- Input: raw data
- Types
- RGB images
- Raw audio data
- Properties
- 3D tensor
- H: height (rows)
- W: width (columns)
- 3: channels (e.g. R, G, B); see the sketch below
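A minimal NumPy sketch of the 3D-tensor view of an RGB input; the 32 * 32 size and the random values are only placeholders:

```python
import numpy as np

# An RGB image as a 3D tensor of shape H * W * 3 (values here are random stand-ins).
H, W = 32, 32                      # spatial size: rows x columns
image = np.random.rand(H, W, 3)    # 3 channels: R, G, B

print(image.shape)   # (32, 32, 3)
print(image[0, 0])   # the 3 channel values of the top-left pixel
```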
- Feed-forward
- ConvNet layers
- Convolutional layer
- Objective: Convolve the filters with the input
- Physical components
- Input (W*W*depth): 3D volume
- Parameters: a set of learnable 3D filters/kernels
- Output: 3D volume
- General size: [(W - F + 2P)/S + 1] * [(W - F + 2P)/S + 1] * depth_out, where depth_out = number of filters
- Unchanged spatial size: P = (F-1)/2 with S = 1 (see the sketch after this section)
- Jobs
- Local connectivity
- Receptive field
- w.r.t. one neuron
- 3D: width * height * depth (filter depth = input depth)
- Filter size (F) = receptive field size
- Entries in the filter + 1 bias = params for one neuron
- Spatial arrangement
- Depth: number of filters
- Stride (S): the step size with which we slide the filter
- Zero-padding (P): pad the input volume with zeros around the border
- Parameter sharing
- Objective: reduce the number of params
- Depth slice: one 2D slice of the output volume at a fixed depth
- All neurons in the same depth slice share one set of weights and one bias (i.e. one filter per slice)
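Below is a small NumPy sketch of the arithmetic above: the output-size formula (W - F + 2P)/S + 1, the per-neuron parameter count F * F * depth + 1, and the savings from parameter sharing. The helper name conv_output_size and the concrete sizes (32 * 32 * 3 input, 10 filters, F=5, S=1, P=2) are illustrative assumptions, not part of the original notes:

```python
import numpy as np

def conv_output_size(W, F, P, S):
    """Spatial output size of a conv layer: (W - F + 2P) / S + 1."""
    assert (W - F + 2 * P) % S == 0, "hyperparameters do not tile the input evenly"
    return (W - F + 2 * P) // S + 1

# Example: 32x32x3 input, 10 filters of size F=5, stride S=1, padding P=2.
W_in, depth_in = 32, 3
F, S, P, depth_out = 5, 1, 2, 10

W_out = conv_output_size(W_in, F, P, S)
print(W_out)                       # 32 -> P=(F-1)/2 with S=1 keeps the spatial size

# Receptive field of one output neuron: F * F * depth_in entries + 1 bias.
params_per_neuron = F * F * depth_in + 1
print(params_per_neuron)           # 76

# With parameter sharing, one depth slice reuses a single filter,
# so the whole layer has depth_out * (F*F*depth_in + 1) parameters.
print(depth_out * params_per_neuron)                   # 760
# Without sharing, every output neuron would own its own weights:
print(W_out * W_out * depth_out * params_per_neuron)   # 778240
```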
- Pooling layer
- Objectives: progressively reduce the spatial size of the representation
- Reduce the amount of params
- Reduce the computation in the network
- Control overfitting
- Job
- Accept a volume of size (W1 * H1 * D1)
- Spatial extent: F
- Stride: S
- Produce a volume of size (W2 * H2 * D2); see the sketch after this section
- W2 = (W1-F)/S+1
- H2 = (H1-F)/S+1
- D2=D1
- Types
- (Common) Max pooling
- (Common) F=2, S=2
- F=3, S=2, overlapping pooling
- Average pooling
- L2-norm pooling
- Features
- No additional params introduced
- Not common to pad the input with zero-padding
- Gradient routing in backpropagation is efficient (e.g. max pooling passes the gradient only to the max element)
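A minimal sketch of max pooling using the size formulas above (W2 = (W1 - F)/S + 1, H2 = (H1 - F)/S + 1, D2 = D1). The helper max_pool and the 4 * 4 * 3 input are illustrative assumptions, and the code assumes the window tiles the input evenly:

```python
import numpy as np

def max_pool(volume, F=2, S=2):
    """Max pooling over each depth slice independently (assumes even tiling)."""
    H1, W1, D1 = volume.shape
    H2, W2 = (H1 - F) // S + 1, (W1 - F) // S + 1
    out = np.empty((H2, W2, D1))
    for i in range(H2):
        for j in range(W2):
            window = volume[i * S:i * S + F, j * S:j * S + F, :]
            out[i, j, :] = window.max(axis=(0, 1))   # depth is preserved: D2 = D1
    return out

x = np.random.rand(4, 4, 3)
y = max_pool(x)          # (4 - 2)/2 + 1 = 2 per spatial side
print(y.shape)           # (2, 2, 3)
```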
- Last layer (FCN): objective function
- Non-linear activation function
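A short sketch of the two items above, assuming ReLU as the non-linearity and softmax cross-entropy as the objective on the final class scores; both are common choices rather than the only ones:

```python
import numpy as np

def relu(x):
    """Common non-linear activation: max(0, x), applied elementwise."""
    return np.maximum(0, x)

def softmax_cross_entropy(scores, label):
    """Objective for the last layer: softmax over class scores + negative log-likelihood."""
    shifted = scores - scores.max()              # shift for numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    return -np.log(probs[label])

scores = np.array([2.0, -1.0, 0.5])              # class scores from the final layer
print(relu(np.array([-3.0, 0.0, 1.5])))          # [0.  0.  1.5]
print(softmax_cross_entropy(scores, label=0))    # small loss: the correct class scores highest
```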
- ConvNet architectures
- Backpropagation
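A PyTorch sketch of one possible small architecture in the LeNet spirit (conv / ReLU / max-pool blocks followed by a fully connected layer); the layer sizes are illustrative assumptions, and backpropagation is handled by autograd through loss.backward():

```python
import torch
import torch.nn as nn

# A minimal LeNet-style stack; sizes are illustrative, not prescriptive.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=5, stride=1, padding=2),   # 3x32x32 -> 16x32x32
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),                   # -> 16x16x16
    nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),   # -> 32x16x16
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),                   # -> 32x8x8
    nn.Flatten(),
    nn.Linear(8 * 8 * 32, 10),                               # last FC layer: class scores
)

x = torch.randn(4, 3, 32, 32)                 # a mini-batch of 4 RGB images
y = torch.randint(0, 10, (4,))                # dummy labels
loss = nn.CrossEntropyLoss()(model(x), y)     # objective function
loss.backward()                               # backpropagation computes all gradients
```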
- Optimizing methods
- Gradient descent
- Batch gradient descent
- Stochastic gradient descent
- (Common) Mini-batch SGD
- Momentum
- RMSProp
- Adam
- Hyperparameters (e.g. learning rate); see the sketch below
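A NumPy sketch of the single-step update rules behind these optimizers, given the current parameters w and a mini-batch gradient g. The hyperparameter values (learning rate, momentum, decay rates, eps) are commonly quoted defaults and are assumptions here, not prescriptions:

```python
import numpy as np

def sgd(w, g, lr=0.01):
    """Vanilla (stochastic) gradient descent."""
    return w - lr * g

def sgd_momentum(w, g, v, lr=0.01, mu=0.9):
    """Momentum: the velocity v accumulates a running update direction."""
    v = mu * v - lr * g
    return w + v, v

def rmsprop(w, g, cache, lr=0.001, decay=0.9, eps=1e-8):
    """RMSProp: scale the step by a running average of squared gradients."""
    cache = decay * cache + (1 - decay) * g**2
    return w - lr * g / (np.sqrt(cache) + eps), cache

def adam(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: momentum-style first moment + RMSProp-style second moment, bias-corrected."""
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

w = np.array([1.0, -2.0])
g = np.array([0.3, -0.1])        # a stand-in gradient from one mini-batch
w, m, v = adam(w, g, m=np.zeros_like(w), v=np.zeros_like(w), t=1)
print(w)
```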