QLoRA (Quantized LoRA) extends LoRA by quantizing the base model to 4-bit precision while keeping the adapters in higher precision. This drastically reduces memory usage.
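Below is a minimal sketch of a typical QLoRA setup using the Hugging Face transformers, bitsandbytes, and peft libraries; the model id, LoRA rank, and target modules are illustrative assumptions, not prescriptions from the QLoRA paper.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Quantize the frozen base-model weights to 4-bit NF4.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute/adapters in higher precision
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model id (an assumption)
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach LoRA adapters; these remain in bfloat16 and are the only trainable weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # commonly targeted attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # tiny fraction of the 4-bit base model
```

The base model's 4-bit weights stay frozen; only the small higher-precision adapter matrices receive gradients, which is what keeps memory usage low during fine-tuning.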
QLoRA was developed to enable fine-tuning of large models (up to 65B parameters) on consumer hardware. By making fine-tuning feasible on a single GPU, it democratized access to LLM fine-tuning.