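Two resampling strategies for balancing datasets with uneven class sizes. Undersampling removes examples from the majority class, at the risk of discarding useful information. Oversampling duplicates examples from the minority class, at the risk of overfitting to the repeated examples.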
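A minimal sketch of both strategies in plain Python, assuming random selection and a target of fully equal class counts. The function names (`split_by_class`, `random_undersample`, `random_oversample`) and the toy data are hypothetical and chosen only for illustration. Real projects often use a dedicated library instead.

```python
import random
from collections import Counter


def split_by_class(samples, labels):
    """Group samples by their class label."""
    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)
    return groups


def random_undersample(samples, labels, seed=0):
    """Shrink every class to the size of the smallest class."""
    rng = random.Random(seed)
    groups = split_by_class(samples, labels)
    target = min(len(g) for g in groups.values())
    out_x, out_y = [], []
    for y, group in groups.items():
        kept = rng.sample(group, target)  # drawn without replacement
        out_x.extend(kept)
        out_y.extend([y] * target)
    return out_x, out_y


def random_oversample(samples, labels, seed=0):
    """Grow every class to the size of the largest class by duplication."""
    rng = random.Random(seed)
    groups = split_by_class(samples, labels)
    target = max(len(g) for g in groups.values())
    out_x, out_y = [], []
    for y, group in groups.items():
        # Duplicates are drawn with replacement from the existing examples.
        extra = rng.choices(group, k=target - len(group))
        out_x.extend(group + extra)
        out_y.extend([y] * target)
    return out_x, out_y


# Toy data (assumed): 8 majority examples (class 0), 2 minority examples (class 1).
X = list(range(10))
y = [0] * 8 + [1] * 2

under_X, under_y = random_undersample(X, y)
over_X, over_y = random_oversample(X, y)

print(Counter(under_y))  # both classes reduced to 2 examples
print(Counter(over_y))   # both classes raised to 8 examples
```

Undersampling here leaves 4 of the original 10 examples, which illustrates the information loss. Oversampling yields 16 examples, but the 6 added ones are exact copies of the 2 minority examples, which illustrates the overfitting risk.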
Where did the term "Undersampling vs. Oversampling" come from?
The term comes from the basics of data preprocessing.
How is "Undersampling vs. Oversampling" used today?