AWQ (Activation-aware Weight Quantization)

What is AWQ (Activation-aware Weight Quantization)?

AWQ is a post-training, weight-only quantization method for large language models. Using activation statistics from a small calibration set, it identifies salient weight channels (those multiplied by high-magnitude activations), scales them up before quantization, and folds the inverse scale back afterward. This shrinks the rounding error on the weights that matter most, preserving model accuracy better than naive round-to-nearest quantization while keeping all weights in the same low-bit format.
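The core idea can be shown in a minimal NumPy sketch. This is an illustration, not the paper's implementation: real AWQ uses group-wise quantization and a per-layer search over the scaling exponent, while here we use simple per-output-channel round-to-nearest and a coarse grid search over a single exponent `alpha` (both simplifications are assumptions for clarity).

```python
import numpy as np

def quantize_rtn(w, n_bits=4):
    """Per-output-channel symmetric round-to-nearest quantization."""
    qmax = 2 ** (n_bits - 1) - 1                      # e.g. 7 for 4-bit signed
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)          # guard all-zero rows
    return np.round(w / scale) * scale                # dequantized weights

def awq_quantize(w, x, n_bits=4, n_grid=20):
    """Activation-aware quantization sketch.

    w: weights [out_features, in_features]; x: calibration activations
    [n_samples, in_features]. Salient input channels (large mean |activation|)
    are scaled up before quantization, reducing their relative rounding error;
    the inverse scale is folded back into the weights. The exponent alpha is
    grid-searched to minimize output reconstruction error (alpha = 0 recovers
    plain round-to-nearest, so the result is never worse on the calibration set).
    """
    act_mag = np.abs(x).mean(axis=0)                  # per-input-channel saliency
    y_ref = x @ w.T
    best_err, best_w = np.inf, None
    for alpha in np.linspace(0.0, 1.0, n_grid):
        s = np.clip(act_mag, 1e-5, None) ** alpha     # channel scaling factors
        w_q = quantize_rtn(w * s, n_bits) / s         # quantize scaled weights, fold back
        err = np.mean((x @ w_q.T - y_ref) ** 2)
        if err < best_err:
            best_err, best_w = err, w_q
    return best_w

# Synthetic demo: a few input channels carry much larger activations.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 64))
x[:, :4] *= 50.0                                      # salient channels
w = rng.normal(size=(32, 64)) * 0.1
y = x @ w.T
err_rtn = np.mean((x @ quantize_rtn(w).T - y) ** 2)
err_awq = np.mean((x @ awq_quantize(w, x).T - y) ** 2)
```

On this toy layer, `err_awq` is at most `err_rtn`, because the search includes the unscaled baseline; when activations are skewed, the activation-aware scaling typically wins outright.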

Where did the term "AWQ (Activation-aware Weight Quantization)" come from?

AWQ was introduced in the 2023 paper "AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration" by Ji Lin and collaborators at MIT's Han Lab; the work later received a Best Paper Award at MLSys 2024.

How is "AWQ (Activation-aware Weight Quantization)" used today?

AWQ is one of the most common formats for 4-bit weight-only LLM checkpoints. It is supported by serving engines such as vLLM and TensorRT-LLM, and in the Hugging Face ecosystem through libraries like AutoAWQ, so pre-quantized AWQ models can often be loaded and served without any additional calibration step.

Related Terms