QLoRA (Quantized LoRA) extends LoRA by quantizing the base model to 4-bit precision while keeping the adapters in higher precision. This drastically reduces memory usage.
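Below is a minimal sketch of a typical QLoRA setup using the Hugging Face transformers, bitsandbytes, and peft libraries; the model id, LoRA rank, and target modules are illustrative assumptions, not prescriptions from the QLoRA paper.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Quantize the frozen base-model weights to 4-bit NF4.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute/adapters in higher precision
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model id (an assumption)
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach LoRA adapters; these remain in bfloat16 and are the only trainable weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # commonly targeted attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # tiny fraction of the 4-bit base model
```

The base model's 4-bit weights stay frozen; only the small higher-precision adapter matrices receive gradients, which is what keeps memory usage low during fine-tuning.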
QLoRA was developed to enable fine-tuning of large models (up to 65B parameters) on consumer hardware. By making fine-tuning feasible on a single GPU, it democratized access to LLM fine-tuning.