The Salt - Curated AI
Archive
Reward Correct CoT for Better Reasoning Models
The Weekly Salt #73
Jun 19 • Benjamin Marie
Magistral: Advancing Reasoning with Efficient GRPO Training
No More KL Penalty, No Need for a Reference Model
Jun 12 • Benjamin Marie
Better Data Recipes for Pre-training LLMs and Training Reasoning Models
The Weekly Salt #72
Jun 11 • Benjamin Marie
Reasoning Models Are More Prone to "Hallucination"
The Weekly Salt #71
Jun 4 • Benjamin Marie
May 2025
End-to-End FP4 Training for LLMs with Blackwell GPUs
The Weekly Salt #70
May 28 • Benjamin Marie
How to Teach LLMs When to Think
The Weekly Salt #69
May 21 • Benjamin Marie
Qwen3 Technical Report: Reasoning in Pre-Training and Post-Training
Plus a Brief Look at the Limitations of the Multilingual Evaluation
May 16 • Benjamin Marie
LLM Alignment: On-Policy vs. Off-Policy Training Data
The Weekly Salt #68
May 14 • Benjamin Marie
A Rectified Softmax and a New Bi-Level Adaptive Reasoning Optimization
The Weekly Salt #67
May 7 • Benjamin Marie
Nemotron-H: The Mamba/Transformer Models by NVIDIA
How can this work?
May 2 • Benjamin Marie
April 2025
Initialize DPO with a DPO Model for Better Preference Optimization
The Weekly Salt #66
Apr 29 • Benjamin Marie
Lossless Compression for LLMs with Dynamic-Length Float
The Weekly Salt #65
Apr 22 • Benjamin Marie