The Salt - Curated AI

November 2025

Improving Generalization of MoEs with Routing Manifold Alignment
The Weekly Salt #95
Nov 12 • 
Benjamin Marie
INT vs FP Data Types for Quantization
The Weekly Salt #94
Nov 5 • 
Benjamin Marie

October 2025

Reasoning with Random Resampling to Match RL's Accuracy
The Weekly Salt #93
Oct 29 • 
Benjamin Marie
A Recipe to Convert Existing Models Into Fine-Tuned BitNet
The Weekly Salt #92
Oct 22 • 
Benjamin Marie
When Quantization Improves Reinforcement Learning
The Weekly Salt #91
Oct 15 • 
Benjamin Marie
Could GRPO Be an "Off-Policy" Algorithm?
The Weekly Salt #90
Oct 9 • 
Benjamin Marie
Pre-Training Updates: NVFP4 and Thinking Augmented
The Weekly Salt #89
Oct 1 • 
Benjamin Marie

September 2025

How Poor SFT Data Overwrites Learned Knowledge
The Weekly Salt #87
Sep 24 • 
Benjamin Marie
Jet-Nemotron: Searching for the Best Attention Architecture
DeltaNet + Hardware-aware Search
Sep 23 • 
Benjamin Marie
MMBERT as a Drop-in Successor to XLM-R
The Weekly Salt #86
Sep 17 • 
Benjamin Marie
LLMs Hallucinate and That's a Benchmarking Problem
The Weekly Salt #85
Sep 10 • 
Benjamin Marie
What Breaks When You Quantize for Translation? A Deep Dive Across 55 Languages
Evaluating LLM translation under quantization with COMET, BLEU, GGUF models, and more
Sep 8 • 
Benjamin Marie
© 2025 Benjamin Marie