The Salt - Curated AI

CriticGPT: How OpenAI Is Improving GPT-4 with GPT-4

And why GPT-4 is getting better at coding tasks

Benjamin Marie
Jul 10, 2024

[Image: Generated with DALL-E]

The GPT-4 models, including the models powering ChatGPT, are designed to be helpful and interactive, using a technique called “Reinforcement Learning from Human Feedback” (RLHF). In RLHF, humans rate and compare different ChatGPT responses to gather valuable feedback that is then used to improve the GPT models.
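
To make the comparison step concrete: the pairwise preferences collected from annotators are typically used to train a reward model with a Bradley-Terry-style objective. Below is a minimal sketch of that loss, assuming PyTorch; the `reward_model` callable and the `*_ids` arguments are illustrative placeholders, not OpenAI's actual pipeline.

```python
import torch.nn.functional as F

def preference_loss(reward_model, prompt_ids, chosen_ids, rejected_ids):
    """Bradley-Terry pairwise loss: push the reward model to score the
    human-preferred response above the rejected one."""
    r_chosen = reward_model(prompt_ids, chosen_ids)      # scalar reward for preferred response
    r_rejected = reward_model(prompt_ids, rejected_ids)  # scalar reward for rejected response
    # -log sigmoid(margin) is minimized when r_chosen exceeds r_rejected
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```

The trained reward model then provides the signal that reinforcement learning uses to fine-tune the chat model, which is why the quality of the underlying human comparisons matters so much.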

As RLHF improves the model's reasoning and behavior, ChatGPT becomes more accurate and its remaining errors become subtler, which makes them harder for human annotators to catch. This poses a significant challenge for RLHF: aligning models may become harder once they approach or surpass human expertise.

In this article, I review how OpenAI trained LLM critics, such as CriticGPT, to generate critiques that point out inaccuracies in ChatGPT’s responses, assisting humans in the RLHF pipeline.
