The Salt - Curated AI
Improving Generalization of MoEs with Routing Manifold Alignment
The Weekly Salt #95
Nov 12 • Benjamin Marie
INT vs FP Data Types for Quantization
The Weekly Salt #94
Nov 5 • Benjamin Marie
October 2025
Reasoning with Random Resampling to Match RL's Accuracy
The Weekly Salt #93
Oct 29 • Benjamin Marie
A Recipe to Convert Existing Models Into Fine-Tuned BitNet
The Weekly Salt #92
Oct 22 • Benjamin Marie
When Quantization Improves Reinforcement Learning
The Weekly Salt #91
Oct 15 • Benjamin Marie
Could GRPO Be an "Off-Policy" Algorithm?
The Weekly Salt #90
Oct 9 • Benjamin Marie
Pre-Training Updates: NVFP4 and Thinking Augmented
The Weekly Salt #89
Oct 1 • Benjamin Marie
September 2025
How Poor SFT Data Overwrites Learned Knowledge
The Weekly Salt #87
Sep 24 • Benjamin Marie
Jet-Nemotron: Searching for the Best Attention Architecture
DeltaNet + Hardware-aware Search
Sep 23 • Benjamin Marie
MMBERT as a Drop-in Successor to XLM-R
The Weekly Salt #86
Sep 17 • Benjamin Marie
LLMs Hallucinate and That's a Benchmarking Problem
The Weekly Salt #85
Sep 10 • Benjamin Marie
What Breaks When You Quantize for Translation? A Deep Dive Across 55 Languages
Evaluating LLM translation under quantization with COMET, BLEU, GGUF models, and more
Sep 8 • Benjamin Marie