Anthropic's Innovation: the Claude 3.7 Sonnet Hybrid Reasoning Model

Anthropic's Innovation: the Claude 3.7 Sonnet Hybrid Reasoning Model
Source: DALL·E 3

Anthropic unveiled the Claude 3.7 Sonnet model on 24th February 2025, the first hybrid reasoning model on the market with a novel thinking capability. This new feature allows the model to dedicate more time and computational resources to solving complex problems whilst making the thinking process visible to users. Claude 3.7 Sonnet is available in all Claude subscription packages, but the hybrid reasoning mode can only be used in the paid versions.

Claude 3.7 Sonnet delivers outstanding performance in coding tasks, achieving 62.3% accuracy in the SWE-bench Verified test (a benchmark evaluating solutions to real software engineering problems), compared to OpenAI's o3-mini model's 49.3% result. Anthropic has significantly reduced unnecessary rejections by 45% (when the model unjustifiably refuses to fulfil user requests), making the model more likely to comply with users' requests. The hybrid reasoning mode has proven particularly useful for complex tasks: the ideal amount of thinking is context-dependent, with more extensive exploration often valuable in creative or philosophical conversations. API users can specify precisely how many tokens the model should use for thinking, up to the 128,000 token output limit.

Anthropic has also introduced Claude Code, a command-line AI assistant for developers, which is currently available in limited research preview. Claude 3.7 Sonnet is already available on the Claude website, in the Claude application, and through the Anthropic API via the Amazon Bedrock and Google Cloud Vertex AI platforms.

Sources:

1.

Claude’s extended thinking
Discussing Claude’s new thought process

2.

I tested Anthropic’s Claude 3.7 Sonnet. Its ‘extended thinking’ mode outdoes ChatGPT and Grok, but it can overthink.
Anthropic has launched its Claude 3.7 Sonnet AI model, featuring an “extended thinking” mode. Here’s how it compares to ChatGPT and Grok.

3.

Claude 3.7 Sonnet debuts with “extended thinking” to tackle complex problems
Anthropic’s first simulated reasoning model is a beast at coding tasks.