-
Head
-
one-stage
-
anchor-based
-
(2016) YOLOv2 onward
- [[tx, ty, tw, th, to] + C] * K
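The layout above can be checked with a little arithmetic. A minimal sketch, assuming K anchors per grid cell and C classes; the function name and the example values (K=5, C=20, the Pascal VOC setting usually quoted for YOLOv2) are illustrative, not taken from these notes.

```python
def anchor_head_channels(num_anchors: int, num_classes: int) -> int:
    """Output channels per grid cell for a [[tx, ty, tw, th, to] + C] * K head."""
    box_and_objectness = 5  # tx, ty, tw, th, to
    return num_anchors * (box_and_objectness + num_classes)


# Assumed example: 5 anchors, 20 classes
print(anchor_head_channels(5, 20))  # 125
```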
-
(2016) SSD
- Anchor boxes on multi-scale feature maps
-
(2017) RetinaNet
- Focal Loss + FPN
-
anchor-free
-
center-based
- (2015) YOLOv1
- [[x, y, w, h, o] * B + C] * cell_number
- (2019) FCOS
- [l, t, r, b] + [C, centerness]
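A rough sketch of the two center-based output layouts above. It assumes a YOLOv1 grid of S x S cells (so cell_number = S * S) and that the four FCOS regression values are the distances l, t, r, b from a location to the left, top, right and bottom sides of its box; the function names and example values are illustrative.

```python
import math


def yolov1_output_size(grid: int, boxes_per_cell: int, num_classes: int) -> int:
    """Total outputs for [[x, y, w, h, o] * B + C] * cell_number, with cell_number = grid * grid."""
    per_cell = 5 * boxes_per_cell + num_classes
    return grid * grid * per_cell


def fcos_centerness(l: float, t: float, r: float, b: float) -> float:
    """Centerness target: 1.0 at the box center, falling toward 0 near the box edges."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


print(yolov1_output_size(7, 2, 20))  # 7 * 7 * 30 = 1470
print(fcos_centerness(2.0, 2.0, 2.0, 2.0))  # 1.0
```

The centerness helper assumes the location lies strictly inside the box, so all four distances are positive.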
-
keypoint-based
- (2019) CenterNet
- (2018) CornerNet
-
two-stage
- (2015) Faster R-CNN (anchor-based)
- (2019) RepPoints (anchor-free)
-
Neck
-
(2016) FPN
- multi-scale feature fusion
- divide-and-conquer
-
(2018) PANet
- all scales matter for objects of different sizes
-
(2019) NAS-FPN, BiFPN
- repeated feature fusion
-
(2021) YOLOF
- Uniform Matching
- dilated convolutions
-
Backbone
-
ResNet, ResNeXt
- first choice as baseline
-
EfficientNet
- model scaling
-
HRNet
- high resolution for localization tasks
-
MobileNet, ShuffleNet
- lightweight, for mobile devices
-
Details
-
Normalization
-
Batch Norm.
- CmBN
- Frozen BN
- SyncBN
- NFNet: replaces BN
-
Layer Norm.
- normalizes along the feature dimension
- remove dependency on batches
-
Group Norm.
- often performs better than Layer Norm. in CNNs
- Weight Standardization
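To make the difference between Layer Norm. and Group Norm. concrete, here is a minimal pure-Python sketch for a single sample with a flat list of channel values. It leaves out the spatial dimensions and the learnable scale and shift, and the helper names are hypothetical.

```python
import math


def normalize(values, eps=1e-5):
    """Shift to zero mean and scale to (roughly) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def layer_norm(features, eps=1e-5):
    """Normalize all features of one sample together; no batch statistics involved."""
    return normalize(features, eps)


def group_norm(features, num_groups, eps=1e-5):
    """Split the channels into equal groups and normalize each group on its own."""
    if len(features) % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    size = len(features) // num_groups
    out = []
    for g in range(num_groups):
        out.extend(normalize(features[g * size:(g + 1) * size], eps))
    return out


x = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0]
print(layer_norm(x))
print(group_norm(x, num_groups=2))
```

With num_groups=1 the group version reduces to the layer version; neither depends on other samples in the batch.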
-
Activation
-
ReLU
- Swish, Leaky ReLU, SiLU, GELU
-
Convolution operators
-
dilated convolution
- enlarges the receptive field while preserving the feature map size
- depth-wise separable convolution
- 1×1 convolution
- deconvolution
- stride
- padding
- speed-accuracy trade-off
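How dilation, stride and padding interact can be sketched with the usual output-size arithmetic. This is a minimal example for one spatial dimension; the function names are illustrative, and the formula is the standard floor-based one used by common deep learning frameworks.

```python
def effective_kernel(kernel: int, dilation: int) -> int:
    """Extent covered by a dilated kernel along one dimension."""
    return dilation * (kernel - 1) + 1


def conv_output_size(size: int, kernel: int, stride: int = 1,
                     padding: int = 0, dilation: int = 1) -> int:
    """Output length of a convolution along one spatial dimension."""
    return (size + 2 * padding - effective_kernel(kernel, dilation)) // stride + 1


# A 3-wide kernel with dilation 2 covers 5 positions
print(effective_kernel(3, 2))  # 5
# With padding 2 the feature map size is preserved
print(conv_output_size(32, 3, padding=2, dilation=2))  # 32
# Stride 2 halves the feature map instead
print(conv_output_size(32, 3, stride=2, padding=1))  # 16
```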
-
Bag of Specials
-
attention mechanism
-
channel-wise
- Squeeze and Excitation
-
point-wise
- Spatial Attention Module
-
feature integration
-
addition
- skip-connection
- concatenation
-
enlarge receptive field
- dilated convolutions
- downsampling
- large kernel
-
Bag of Freebies
- style transfer
- MixUp, CutMix
- photometric distortion
- geometric distortion
- DropOut/ DropBlock
- label smoothing
- training tricks
- Augmentation
- Regularization
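Of the items above, label smoothing is the easiest to show in a few lines. A minimal sketch, assuming the common variant that spreads epsilon uniformly over all classes; the function name and the default epsilon=0.1 are illustrative choices.

```python
def smooth_labels(target_index: int, num_classes: int, epsilon: float = 0.1):
    """Return a smoothed one-hot vector: (1 - epsilon) * one_hot + epsilon / num_classes."""
    off_value = epsilon / num_classes
    return [
        (1.0 - epsilon) + off_value if i == target_index else off_value
        for i in range(num_classes)
    ]


# The target class keeps most of the mass; the rest is shared by all classes
print(smooth_labels(1, 4))
```

The smoothed vector still sums to 1, so it can be used directly as the target distribution of a cross-entropy loss.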
-
Multi-task Loss
-
classification
-
CE
-
BCE (+sigmoid)
- H(P, Q) = -(P(class0) * log(Q(class0)) + P(class1) * log(Q(class1)))
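The BCE formula above can be written out directly. A minimal sketch, assuming P(class1) is the target probability p_true, P(class0) = 1 - p_true, and Q(class1) = sigmoid(logit); the function names and the clamping constant eps are illustrative.

```python
import math


def sigmoid(logit: float) -> float:
    """Numerically stable logistic function."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def binary_cross_entropy(p_true: float, logit: float, eps: float = 1e-12) -> float:
    """H(P, Q) with Q(class1) = sigmoid(logit) and Q(class0) = 1 - Q(class1)."""
    q1 = min(max(sigmoid(logit), eps), 1.0 - eps)  # clamp to avoid log(0)
    return -((1.0 - p_true) * math.log(1.0 - q1) + p_true * math.log(q1))


# A logit of 0 means Q(class1) = 0.5, so the loss is log(2)
print(binary_cross_entropy(1.0, 0.0))
```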
-
softmax+NLL
- not suitable for open-set recognition; see AM-Softmax
- KL-divergence
-
Focal Loss
- GFL
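For reference, a minimal sketch of the binary Focal Loss for a single prediction, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The defaults gamma=2 and alpha=0.25 are the values usually quoted from the RetinaNet paper; the function name and the alpha=None switch are assumptions made for illustration.

```python
import math


def focal_loss(q: float, y: int, gamma: float = 2.0, alpha=0.25, eps: float = 1e-12) -> float:
    """Binary focal loss; q is the predicted probability of the positive class, y is 0 or 1."""
    p_t = q if y == 1 else 1.0 - q           # probability assigned to the true class
    p_t = min(max(p_t, eps), 1.0)            # clamp to avoid log(0)
    if alpha is None:
        alpha_t = 1.0                        # no class balancing
    else:
        alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (q = 0.9) is down-weighted far more than a hard one (q = 0.1)
print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

With gamma=0 and alpha=None the expression reduces to plain cross-entropy, which is a quick sanity check.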
-
Seesaw Loss
- class imbalance
-
localization
- MSE, SSE
- IoU (GIoU, DIoU, CIoU)
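A minimal sketch of IoU and its GIoU extension, assuming axis-aligned boxes given as (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2; the function names are illustrative. The corresponding losses are usually taken as 1 - IoU and 1 - GIoU.

```python
def box_area(box) -> float:
    """Area of an (x1, y1, x2, y2) box."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two axis-aligned boxes."""
    # Intersection rectangle
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = box_area(a) + box_area(b) - inter
    iou = inter / union if union > 0 else 0.0
    # Smallest box enclosing both inputs
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    giou = iou - (enclose - union) / enclose if enclose > 0 else iou
    return iou, giou


# Overlapping boxes: IoU = 1/7, GIoU = 1/7 - 2/9
print(iou_and_giou((0, 0, 2, 2), (1, 1, 3, 3)))
# Disjoint boxes: IoU is 0, but GIoU is negative and still reflects the distance
print(iou_and_giou((0, 0, 1, 1), (2, 2, 3, 3)))
```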
-
combination
- (weighted) sum
- Generalized Focal Loss
-
Positive/Negative samples
- ATSS