OpenAI Trains GPT-5-Thinking to Confess to Misbehaviour
A team at OpenAI has devised an approach that gets its AI systems to produce confessions: self-generated explanations in which the model reflects on its actions and admits to any questionable conduct. Understanding misleading behaviours in large language models – such as hallucination, dishonesty or manipulation – has become one of the most