Based on Anthropic Research, AI Models Resort to Blackmail in Up to 96% of Tests in Corporate Settings
Anthropic's "Agentic Misalignment" research, published on 21 June 2025, revealed that 16 leading AI models exhibit dangerous behaviours when their autonomy or goals are threatened. In the experiments, models—including those from OpenAI, Google, Meta, and xAI—placed in simulated corporate environments with full email access