xAI Launches Grok 4 Fast: A Leap in Cost-Efficient AI



James Ding
Sep 19, 2025 21:46

xAI introduces Grok 4 Fast, advancing cost-efficient reasoning models with superior token efficiency and performance, offering a unified architecture for enterprise and consumer applications.



xAI Launches Grok 4 Fast: A Leap in Cost-Efficient AI

Introduction to Grok 4 Fast

xAI has unveiled Grok 4 Fast, a groundbreaking advancement in cost-efficient reasoning models. Building on the successes of Grok 4, this new model offers exceptional token efficiency, making high-quality reasoning more accessible to developers and users across various domains. Grok 4 Fast integrates state-of-the-art cost-efficiency with advanced web and X search capabilities, featuring a 2M token context window and a unified architecture for both reasoning and non-reasoning modes.

Performance and Efficiency

According to xAI, Grok 4 Fast surpasses its predecessor, Grok 3 Mini, in reasoning benchmarks, achieving similar performance to Grok 4 while reducing token usage by 40%. This efficiency results in a 98% reduction in the cost to achieve the same performance on frontier benchmarks. The model’s enhanced intelligence density is verified by an independent review from Artificial Analysis, showcasing a superior price-to-intelligence ratio.

Advanced Capabilities

Grok 4 Fast is engineered with large-scale reinforcement learning, optimizing its tool-use capabilities. The model excels in deciding when to utilize tools like code execution or web browsing, boasting advanced agentic search capabilities. It can seamlessly browse the web, accessing real-time data and synthesizing information at high speeds, setting a new standard for cost-effective intelligence across general domains.

Benchmark Success

The model’s prowess is evident in LMArena’s Search Arena, where Grok 4 Fast, under the code name ‘menlo’, secured the top position with an Elo score of 1163, outperforming its nearest competitor by a significant margin. In the Text Arena, Grok 4 Fast ranks eighth, demonstrating its superior intelligence density compared to larger models.

Unified Architecture

Grok 4 Fast introduces a unified architecture that combines reasoning and non-reasoning capabilities within the same model weights, reducing latency and token costs. This architecture allows for real-time applications, making it ideal for both simple and complex queries. Developers using the xAI API can fine-tune the model’s behavior to optimize for speed or depth.

Availability and Pricing

Grok 4 Fast is now available to all users, including free users, marking a step towards democratizing advanced AI. The model is offered in two versions: ‘grok-4-fast-reasoning’ and ‘grok-4-fast-non-reasoning’, each supporting a 2M token context window. Pricing varies based on token usage, with input tokens costing $0.20 per million for less than 128k tokens and $0.40 per million for 128k tokens or more. Output tokens are priced at $0.50 per million for less than 128k tokens and $1.00 per million for higher usage.

For further information, the Grok 4 Fast model card is available on the xAI website. xAI plans to continue enhancing Grok 4 Fast based on user feedback, with future integrations including multimodal capabilities and agentic features.

For more details, visit the xAI official announcement.

Image source: Shutterstock


Source: https://blockchain.news/news/xai-launches-grok-4-fast-cost-efficient-ai