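A data augmentation technique that combines four training images into a single image arranged in a 2x2 grid. Because each source image is scaled or cropped to fit its cell and placed next to unrelated scenes, the model sees objects at varied scales and in varied contexts within one training sample.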
Popularized by YOLOv4.
Now a common default in many object detection training pipelines, particularly in the YOLO family.
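
A minimal NumPy sketch of the idea, under simplifying assumptions: it is not the YOLOv4 implementation. Each image is resized (nearest neighbour, aspect ratio not preserved) to fill one quadrant around a random split point, whereas real pipelines typically crop and also clip or filter boxes. The names `mosaic`, `resize_nearest`, and `out_size` are hypothetical, chosen for illustration.

```python
import numpy as np


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize using plain NumPy indexing."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return img[rows[:, None], cols]


def mosaic(images, boxes, out_size=640, rng=None):
    """Combine four (H, W, 3) uint8 images and their boxes into one mosaic.

    boxes: four arrays of shape (N_i, 4), pixel [x1, y1, x2, y2] in the
    coordinates of the corresponding source image.
    Returns the mosaic image and an (N, 4) array of boxes in mosaic pixels.
    """
    assert len(images) == 4 and len(boxes) == 4
    rng = np.random.default_rng() if rng is None else rng
    # Random split point, kept away from the borders so no quadrant is empty.
    cx = int(rng.integers(out_size // 4, 3 * out_size // 4))
    cy = int(rng.integers(out_size // 4, 3 * out_size // 4))
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    # Quadrants as (x0, y0, x1, y1): top-left, top-right, bottom-left, bottom-right.
    quads = [(0, 0, cx, cy), (cx, 0, out_size, cy),
             (0, cy, cx, out_size), (cx, cy, out_size, out_size)]
    out_boxes = []
    for img, bxs, (x0, y0, x1, y1) in zip(images, boxes, quads):
        h, w = img.shape[:2]
        qw, qh = x1 - x0, y1 - y0
        canvas[y0:y1, x0:x1] = resize_nearest(img, qh, qw)
        bxs = np.asarray(bxs, dtype=np.float32).reshape(-1, 4)
        # Scale boxes by the same factors as the image, then shift into the quadrant.
        scale = np.array([qw / w, qh / h, qw / w, qh / h], dtype=np.float32)
        offset = np.array([x0, y0, x0, y0], dtype=np.float32)
        out_boxes.append(bxs * scale + offset)
    return canvas, np.concatenate(out_boxes, axis=0)
```

Because the split point is random, the four quadrants differ in size from sample to sample, which is where the scale variation comes from in this sketch.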