xAI Launches Grok-4-Fast: Unified Reasoning Architecture with a 2 Million Token Context Window

xAI Launches Grok-4-Fast: Unified Reasoning Architecture with a 2 Million Token Context Window
Source: Trevor Cokley, Public domain, via Wikimedia Commons

xAI announced the Grok-4-Fast model in September 2025, which integrates "reasoning" and "non-reasoning" modes into a unified architecture whilst using 40% fewer thinking tokens than the Grok-4 model. The Grok-4-Fast can process up to 2 million text units at once and has been trained to use external tools effectively. As a result, it can browse the internet, run code, and operate other digital tools.

It is available through the xAI API in two versions: grok-4-fast-reasoning and grok-4-fast-non-reasoning, both priced at $0.20/1 million input tokens (<128k) and $0.50/1 million output tokens (<128k), whilst cached input tokens are available at $0.05/1 million. Grok-4-Fast delivers outstanding performance on mathematical and scientific reasoning tasks: 92.0% on the AIME 2025 competition (secondary school mathematics problems), 93.3% on the HMMT 2025 test (Harvard–MIT mathematics tournament), 85.7% on the GPQA Diamond assessment (doctoral-level scientific questions) and 80.0% on the LiveCodeBench programming test (January–May), whilst using on average 40% fewer thinking tokens than Grok-4.

According to an independent review by Artificial Analysis, Grok-4-Fast exhibits the best price-to-intelligence ratio amongst publicly available frontier models, and xAI reports a 98% reduction in price to achieve the same performance as Grok-4. According to xAI's published model card, Grok-4-Fast was trained using large-scale reinforcement learning to maximise intelligence density, with explicit post-training on tool use and safety demonstrations. Its unified model architecture is steerable via system prompts, thereby reducing end-to-end latency and token costs, making it ideal for real-time applications such as search and interactive coding.

Sources:

1.

xAI Logo
Grok 4 Fast — xAI
Grok 4 Fast delivers cost-efficient intelligence with a 2M token context window and unified reasoning architecture.

2.

What to Know About Grok 4 Fast for Enterprise Use Cases
Grok 4 Fast is a streamlined version of xAI’s flagship model

3.

xAI launches Grok-4-Fast: Unified Reasoning and Non-Reasoning Model with 2M-Token Context and Trained End-to-End with Tool-Use Reinforcement Learning (RL)
xAI’s Grok-4-Fast unifies reasoning/non-reasoning, 2M-token context, tool-use reinforcement learning RL, aggressive pricing