Batch Size

What is Batch Size?

The number of training examples processed in one iteration of model training, i.e. one forward/backward pass before the model's weights are updated. It determines how many weight updates occur per epoch. Larger batches yield lower-variance gradient estimates and better hardware utilization, but require more VRAM.
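The relationship between batch size and update frequency can be shown with a minimal NumPy sketch of mini-batch gradient descent on a toy linear-regression problem (the data, learning rate, and variable names here are illustrative, not from any particular framework):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 3x + noise (hypothetical example).
X = rng.normal(size=(1000, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=1000)

batch_size = 32   # examples per gradient step
lr = 0.1          # learning rate (illustrative value)
w = 0.0           # single weight to learn

updates = 0
for start in range(0, len(X), batch_size):
    xb = X[start:start + batch_size, 0]
    yb = y[start:start + batch_size]
    grad = 2 * np.mean((w * xb - yb) * xb)  # MSE gradient over this batch
    w -= lr * grad                          # one weight update per batch
    updates += 1

print(updates)  # ceil(1000 / 32) = 32 updates in one epoch
print(w)        # approaches the true weight, 3.0
```

Doubling `batch_size` here would halve the number of updates per epoch while making each gradient an average over more examples, which is the core trade-off the definition describes.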

Where did the term "Batch Size" come from?

From mini-batch gradient descent: a "batch" is the subset of the training set processed together in one step, sitting between full-batch gradient descent (the whole dataset per update) and pure stochastic gradient descent (one example per update). It has been a fundamental training parameter ever since.

How is "Batch Size" used today?

A core tuning knob, traded off against learning rate and available memory. Colloquially, it is the first thing you lower when you hit an out-of-memory (OOM) error.
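When batch size has to be reduced to fit in memory, gradient accumulation can preserve the effective batch size: average gradients over several small micro-batches, then apply a single update. A framework-agnostic NumPy sketch (the names `micro_batch` and `accum_steps` are hypothetical, not a library API):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=256)

micro_batch = 8   # what fits in memory (illustrative)
accum_steps = 4   # effective batch size = 8 * 4 = 32
lr = 0.1
w = 0.0

grad_sum, seen = 0.0, 0
for start in range(0, len(X), micro_batch):
    xb = X[start:start + micro_batch, 0]
    yb = y[start:start + micro_batch]
    grad_sum += 2 * np.mean((w * xb - yb) * xb)  # accumulate, don't update yet
    seen += 1
    if seen == accum_steps:
        w -= lr * (grad_sum / accum_steps)  # one update per 4 micro-batches
        grad_sum, seen = 0.0, 0

print(w)  # moves toward the true weight, 3.0
```

Each update sees the averaged gradient of 32 examples, so the optimization behaves like batch size 32 while only 8 examples are ever in memory at once.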

Related Terms