Perplexity measures how 'surprised' a model is by a text sequence: it is the exponential of the average negative log-likelihood per token, so lower perplexity means the model assigns higher probability to the text.
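As a minimal sketch, assuming we already have per-token log-probabilities from some model, perplexity can be computed directly as exp(-1/N * sum(log p(x_i | x_<i))); the log-prob values below are hypothetical, purely for illustration.

```python
import math

def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    perplexity = exp(-(1/N) * sum(log p(x_i | x_<i)))
    """
    n = len(token_log_probs)
    avg_nll = -sum(token_log_probs) / n  # average negative log-likelihood
    return math.exp(avg_nll)

# Hypothetical log-probs for two 4-token sequences:
# a confident model assigns high probabilities, giving low perplexity.
print(perplexity([-0.1, -0.3, -0.2, -0.4]))  # ~1.28 (model is rarely surprised)
print(perplexity([-2.0, -3.0, -2.5, -3.5]))  # ~15.6 (model is often surprised)
```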
It is a fundamental metric in NLP: useful as a sanity check during pre-training, but a poor proxy for chat quality, since a model can assign high probability to fluent text while still giving unhelpful or incorrect answers.