LoRA is one of the most widely used parameter-efficient fine-tuning (PEFT) methods for large language models (LLMs). However, there remains a performance gap between LoRA and full fine-tuning. Previous work has proposed various improvements to LoRA to make it more memory-efficient or more accurate: DoRA, LoftQ, VeRA, etc.
In an article for The Kaitchup, I showed that it can be difficult to confirm the potential advantages of these alternatives. For instance, I didn’t see any difference between DoRA and LoRA.
We still don’t have a PEFT method clearly matching the performance of full fine-tuning.
MoRA is yet another method aiming to close this performance gap. This PEFT method is original in that it fine-tunes a high-rank adapter instead of a low-rank one. According to its authors, MoRA outperforms LoRA on several tasks such as continual pre-training and instruction fine-tuning.
In this article, I review MoRA. We will see how MoRA can fine-tune a high-rank adapter on top of an LLM with a number of trainable parameters similar to LoRA. Then, we will try MoRA with a quantized Llama 3 8B, i.e., QMoRA, to check its performance.
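Before diving in, here is a minimal sketch of the parameter-budget idea. It assumes the core design described by the authors: instead of LoRA's two low-rank factors, MoRA trains one square matrix whose side is chosen so that its parameter count matches LoRA's, which yields a much higher rank. The truncation/zero-padding compression shown here is only one simple option for bridging the dimension gap; the sizes and function names are illustrative, not the authors' implementation.

```python
import numpy as np

# Illustrative sizes: hidden dims of a Llama-3-8B-style layer and a LoRA rank
d, k, r = 4096, 4096, 8

# LoRA trains two low-rank factors: B (d x r) and A (r x k)
lora_params = r * (d + k)          # 65,536 trainable parameters

# MoRA instead trains one square matrix M whose side r_hat is chosen so
# that r_hat**2 matches LoRA's parameter budget -> a much higher rank
r_hat = int(np.sqrt(lora_params))  # 256 here, vs. LoRA's rank of 8
M = np.zeros((r_hat, r_hat))
assert M.size == lora_params       # same trainable-parameter budget

# Non-trainable compress/decompress maps bridge the dimension gap.
# Truncation and zero-padding are one simple scheme (an assumption in
# this sketch; the paper discusses several such operators).
def compress(x):                   # R^k -> R^r_hat
    return x[:r_hat]

def decompress(h):                 # R^r_hat -> R^d
    return np.pad(h, (0, d - r_hat))

x = np.random.randn(k)
delta = decompress(M @ compress(x))  # the adapter's contribution to W @ x
print(r_hat, M.size, delta.shape)    # 256 65536 (4096,)
```

The point to take away is that `M` has the same number of trainable parameters as LoRA's `B` and `A` combined, but its rank can be as high as 256 rather than 8.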
I made a notebook showing how to fine-tune Llama 3 with MoRA: