LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning (PEFT) technique that freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture.
It was introduced by Microsoft researchers (Hu et al., 2021) to reduce the memory requirements of fine-tuning large models.
It has since become the most popular method for community fine-tuning of open-weights models such as Llama and Stable Diffusion.
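Concretely, LoRA keeps a pre-trained weight matrix W0 (shape d×k) frozen and learns a low-rank update ΔW = BA, where B is d×r, A is r×k, and the rank r is much smaller than d and k, so the forward pass becomes h = W0·x + BA·x. The sketch below illustrates this in PyTorch under stated assumptions: the wrapper class name LoRALinear, the default rank and scaling values, and the init scheme are illustrative choices for this example, not an official implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable low-rank update:
    y = x @ W0.T + (alpha / r) * x @ A.T @ B.T
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Freeze the pre-trained weights (and bias, if present).
        for p in self.base.parameters():
            p.requires_grad = False
        # Low-rank factors: A projects down to rank r, B projects back up.
        # B starts at zero so the initial update is zero and the wrapped
        # layer exactly reproduces the frozen pre-trained layer.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r  # scale applied to the low-rank update

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen forward pass plus the trainable low-rank correction.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```

With r = 8 on a 4096×4096 layer, only about 65K parameters train instead of roughly 16.8M, and after training the product BA can be merged into W0, so inference adds no extra latency.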