Qwen3 Technical Report: Reasoning in Pre-Training and Post-Training
Plus a Brief Look at the Limitations of the Multilingual Evaluation
Qwen3 was released last month, and I can confirm that it is just as easy to use as Qwen2.5 while offering better performance on many tasks.
One recurring complaint I have about the model is its verbosity. It often produces unnecessarily long responses, even when reasoning is turned off. The Qwen3 technical report, released this week, helps explain why: the model is pre-trained to reason.
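As a quick aside for readers who haven't tried it yet, here is how reasoning is typically toggled when running Qwen3 via Hugging Face Transformers. This is a minimal sketch following the usage shown in the official Qwen3 model cards; the checkpoint name and prompt below are placeholders I chose for illustration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B"  # any Qwen3 instruct checkpoint works the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Summarize RoPE in one sentence."}]

# enable_thinking=False switches off reasoning mode via the chat template;
# with the default (True), the model emits a <think>...</think> block
# before its final answer.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(
    tokenizer.decode(
        outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
    )
)
```

Per the model card, Qwen3 also supports soft switches: appending /think or /no_think to a user message toggles the mode turn by turn in multi-turn conversations. Even with reasoning disabled this way, though, the responses tend to stay long.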
In this article, I review the technical report and highlight the main design choices behind Qwen3. Architecturally, the models are quite similar to Qwen2.5. The key differences lie in the multi-stage pre-training and post-training pipelines. I’ll also dedicate the final section to some critical thoughts on the multilingual evaluation.