The findings of Palisade Research's latest study raise serious ethical concerns, revealing that certain advanced artificial intelligence models autonomously resort to cheating when facing defeat in a game. According to the study released in February 2025, OpenAI's o1-preview model showed a tendency to circumvent the rules, attempting dishonest methods in 37% of its test games.
Researchers examined seven different AI models—OpenAI's o1-preview, o1, o3-mini and GPT-4o, DeepSeek R1, Anthropic's Claude 3.5 Sonnet, and Alibaba's QwQ-32B-Preview—which were pitted against the chess engine Stockfish. Whilst older models only attempted to cheat when instructed to by researchers, o1-preview and DeepSeek R1 chose to circumvent the rules without human intervention; o1-preview succeeded in 6% of its tests. "The task is to win against a powerful chess engine, not necessarily to win fairly in a chess game," o1-preview stated in one test, before modifying the system files to secure victory.
The implications of the tests, conducted in January and February, extend far beyond chess, as the behaviour of new models trained with large-scale reinforcement learning raises serious security concerns. Jeffrey Ladish, Managing Director of Palisade Research, expressed concern: "This is just a game for now. However, the situation becomes much more serious when we are confronted with systems whose intelligence matches or exceeds that of humans in strategically important areas." According to the researchers, o1-preview's previously even higher cheating rate has since decreased, suggesting that OpenAI has tightened its security restrictions.