Undersampling vs. Oversampling

What is Undersampling vs. Oversampling?

Undersampling and oversampling are resampling techniques for handling imbalanced datasets, in which one class is much more frequent than the others. Undersampling removes examples from the majority class, while oversampling adds examples to the minority class, usually by duplicating existing examples or synthesizing new ones. Both aim to give a classifier a more balanced view of the classes during training.
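As a rough illustration of the two ideas, the sketch below implements the simplest random variants using only the Python standard library. The function names (`random_undersample`, `random_oversample`), the helper `_indices_by_class`, and the toy data are hypothetical and chosen for this example. It balances every class to the size of the smallest (or largest) class and only duplicates existing examples; it does not synthesize new ones.

```python
import random
from collections import Counter


def _indices_by_class(labels):
    """Group example indices by their class label."""
    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(label, []).append(i)
    return groups


def random_undersample(rows, labels, seed=0):
    """Randomly drop examples until every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _indices_by_class(labels)
    target = min(len(idx) for idx in groups.values())
    keep = []
    for idx in groups.values():
        # Sample without replacement: each kept example appears once.
        keep.extend(rng.sample(idx, target))
    return [rows[i] for i in keep], [labels[i] for i in keep]


def random_oversample(rows, labels, seed=0):
    """Randomly duplicate examples until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _indices_by_class(labels)
    target = max(len(idx) for idx in groups.values())
    keep = []
    for idx in groups.values():
        keep.extend(idx)  # keep every original example
        # Sample with replacement to fill the gap with duplicates.
        keep.extend(rng.choices(idx, k=target - len(idx)))
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data (made up for illustration): 8 majority examples, 2 minority examples.
rows = [[float(i)] for i in range(10)]
labels = [0] * 8 + [1] * 2

under_rows, under_labels = random_undersample(rows, labels)
over_rows, over_labels = random_oversample(rows, labels)

print(Counter(under_labels))  # each class now has 2 examples
print(Counter(over_labels))   # each class now has 8 examples
```

Undersampling here discards six of the eight majority examples, which shows the cost of the approach: information is thrown away. Oversampling keeps everything but repeats the two minority examples, which can encourage overfitting to them. In practice, resampling is normally applied to the training split only, so that evaluation data keeps its original class distribution.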

Where did the term "Undersampling vs. Oversampling" come from?

The terms come from standard data preprocessing practice for classification tasks, where a training set is resampled to change the balance between its classes.

How is "Undersampling vs. Oversampling" used today?

Both techniques are widely used in applications where the class of interest is rare, such as fraud detection, medical diagnosis, and rare event prediction.

Related Terms