Key Highlights
- Alphabet introduced its 8th-generation tensor processing units: the TPU 8t, optimized for training, and the TPU 8i, for inference workloads
- The inference-focused TPU 8i delivers 80% better performance per dollar than its predecessor, Ironwood
- Both processors were developed in partnership with Broadcom and engineered in collaboration with Google DeepMind
- The TPU 8t training processor can scale up to 9,600 chips and offers double the interchip bandwidth of Ironwood
- Google Cloud customers will gain access to both chip variants later in 2025
Alphabet’s Google division introduced two specialized artificial intelligence processors on Wednesday, marking the first time its tensor processing unit architecture has been separated into distinct chips for training and inference operations.
The TPU 8t is engineered specifically for AI model training, while its counterpart, the TPU 8i, focuses exclusively on inference—the process of deploying trained models in real-world applications. Broadcom served as the co-development partner, extending a collaboration that has spanned more than ten years.
This represents a strategic pivot from previous approaches. Earlier TPU iterations combined both training and inference capabilities within a single processor. Google attributes this change to the emergence of agentic AI systems—autonomous models that operate in continuous feedback loops with minimal human oversight—which require more purpose-built silicon.
“With the rise of AI agents, we determined the community would benefit from chips individually specialized to the needs of training and serving,” explained Amin Vahdat, Google’s Senior Vice President and Chief Technologist for AI and Infrastructure.
The inference-oriented TPU 8i packs 384 megabytes of SRAM per processor—three times the capacity of Ironwood. According to Google, this architectural enhancement eliminates the “waiting room” bottleneck, reducing latency spikes that occur when multiple users simultaneously query a model.
Inference Capabilities See Dramatic Improvements
Compared to Ironwood, the TPU 8i achieves 80% better cost efficiency. In operational terms, organizations can accommodate nearly double the user demand without increasing their budget.
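The "nearly double" framing follows directly from the 80% figure. A minimal back-of-envelope sketch, where the 1.8x multiplier comes from the article but the budget and baseline throughput numbers are purely illustrative assumptions:

```python
# Hedged sketch: translating "80% better performance per dollar" into
# served demand at a fixed budget. Only the 1.8x factor is from the
# article; the dollar and QPS figures below are hypothetical.

def capacity_at_budget(budget_usd: float, qps_per_dollar: float) -> float:
    """Queries-per-second capacity purchasable for a given budget."""
    return budget_usd * qps_per_dollar

baseline_qps_per_dollar = 10.0                        # hypothetical Ironwood figure
tpu8i_qps_per_dollar = baseline_qps_per_dollar * 1.8  # 80% better, per Google

budget = 1_000.0
old_capacity = capacity_at_budget(budget, baseline_qps_per_dollar)
new_capacity = capacity_at_budget(budget, tpu8i_qps_per_dollar)

print(new_capacity / old_capacity)  # 1.8x the demand served at the same spend
```

At a fixed budget, capacity scales linearly with performance per dollar, so an 80% improvement means 1.8x the traffic served for the same spend.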
The chip also delivers up to twice the performance per watt, enabled by dynamic power management technology that modulates energy consumption based on real-time workload requirements.
For the first time, both processors utilize Google’s Axion CPU as the host processor, enabling optimization at the system architecture level rather than limiting improvements to individual chip performance.
Regarding training capabilities, the TPU 8t superpod configuration supports clusters of up to 9,600 processors with 2 petabytes of high-bandwidth memory. This represents double the interchip communication bandwidth of Ironwood, and Google claims it can compress frontier model development timelines from months to mere weeks.
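The cluster-level figures imply a per-chip memory footprint. A quick derivation, using only the 9,600-chip and 2-petabyte numbers from the article (the per-chip result is an estimate, not an official specification):

```python
# Back-of-envelope: implied per-chip HBM in a full TPU 8t superpod.
# Both input figures are from the article; the per-chip result assumes
# memory is evenly distributed, which Google has not stated.

chips = 9_600
total_hbm_bytes = 2 * 10**15  # 2 petabytes (decimal)

per_chip_gb = total_hbm_bytes / chips / 10**9
print(round(per_chip_gb))  # roughly 208 GB of HBM per chip
```

That works out to on the order of 200 GB of high-bandwidth memory per chip if the 2 PB is spread evenly across the superpod.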
The training processor delivers 2.8 times the computational performance of the seventh-generation Ironwood architecture at an equivalent price point.
Early Adopters and Industry Impact
Early adoption is gaining momentum. Citadel Securities developed quantitative research platforms using Google’s TPU infrastructure. All seventeen United States Department of Energy national laboratories operate AI co-scientist applications on the processors. Anthropic has made commitments to utilize multiple gigawatts of Google TPU computing capacity.
Analysts at DA Davidson projected in September that Google’s TPU division, when combined with Google DeepMind, could command a valuation approaching $900 billion.
Google maintains an exclusive distribution model for TPUs—they are not available for direct purchase and can only be accessed through Google Cloud services. Nvidia continues to supply GPU hardware to Google, and the company confirmed it will be among the initial cloud service providers offering Nvidia’s forthcoming Vera Rubin platform when it launches later this year.
The processors were engineered in close collaboration with Google DeepMind, which has deployed them to train Gemini language models and optimize algorithms powering Search and YouTube platforms.
Google announced that both the TPU 8t and TPU 8i will reach general availability for cloud platform customers later in 2025.
The post Alphabet (GOOGL) Unveils Dual-Purpose 8th-Gen TPU Chips Developed With Broadcom appeared first on Blockonomi.