Related Articles
Benjamin Marie · Oct 21, 2024

A selection of related articles published in my other newsletter, The Kaitchup:

- Fast Speculative Decoding with Llama 3.2 and vLLM (Oct 14)
- Train and Serve an AI Chatbot Based on Llama 3.2 (Oct 17)
- QLoRA with AutoRound: Cheaper and Better LLM Fine-tuning on Your GPU (Aug 19)
- Run Llama 3.1 70B Instruct on Your GPU with ExLlamaV2 (2.2, 2.5, 3.0, and 4.0-bit) (Aug 29)
- How to Set Up a PEFT LoraConfig (Sep 27)
- DoRA vs. LoRA: Better and Faster than LoRA? (Mar 11)
- Google's Gemma: Fine-tuning, Quantization, and Inference on Your Computer (Feb 26)
- Run Llama 3 70B on Your GPU with ExLlamaV2 (May 6)
- SqueezeLLM: Better 3-bit and 4-bit Quantization for Large Language Models (Feb 12)
- QLoRA: Fine-Tune a Large Language Model on Your GPU (May 30, 2023)