TLDR
- AWS is set to roll out Cerebras’s Wafer-Scale Engine chips in its data centers to support AI inference tasks.
- The agreement spans multiple years, though financial details have not been made public.
- Cerebras asserts that its chips can handle inference tasks up to 25 times quicker than Nvidia’s GPUs.
- In January 2026, OpenAI inked a separate deal with Cerebras valued at more than $10 billion.
- Cerebras secured $1 billion in funding in February 2026, putting the startup’s valuation at approximately $23 billion.
Amazon Web Services (AWS) has entered into a multiyear partnership with chip startup Cerebras Systems to install its Wafer-Scale Engine processors in AWS data centers. The chips will be used specifically for AI inference, the stage in which a trained AI model generates responses to user queries.
$AMZN’s AWS and Cerebras are partnering to bring faster AI inference to Amazon Bedrock in the coming months.
The setup will combine Trainium for prompt processing with Cerebras CS-3 for token generation, with Bedrock becoming the first cloud service to offer Cerebras’…
— Wall St Engine (@wallstengine) March 13, 2026
AWS is the world’s leading cloud service provider. Traditionally, it has relied on its in-house Trainium chips, developed by its semiconductor division Annapurna Labs. Under the new agreement, AWS intends to pair Trainium with Cerebras chips to create a faster inference offering.
Cerebras states that its Wafer-Scale Engine can handle the “decode” stage of inference (the phase where the model produces its actual response, one token at a time) up to 25 times faster than Nvidia’s GPUs.
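To see why the two stages suit different hardware, here is a minimal toy sketch of the prefill/decode split described above. Prefill processes the whole prompt in one parallel-friendly pass (the work AWS assigns to Trainium), while decode emits tokens one at a time, each step depending on the last (the latency-bound work Cerebras targets). All function names and the token format here are illustrative, not any real AWS or Cerebras API.

```python
# Toy illustration of the two-phase inference split: prefill (prompt
# processing) vs. decode (token generation). Not a real model -- the
# bodies are stand-ins that only show the control flow.

def prefill(prompt_tokens: list[str]) -> list[str]:
    """Process the entire prompt at once; parallelizable work."""
    return list(prompt_tokens)  # stand-in for building the model's context


def decode(context: list[str], max_new_tokens: int) -> list[str]:
    """Generate tokens one at a time; inherently sequential work."""
    generated = []
    for _ in range(max_new_tokens):
        next_token = f"tok{len(generated)}"  # stand-in for model sampling
        generated.append(next_token)
        context.append(next_token)  # autoregressive: output feeds next step
    return generated


context = prefill(["What", "is", "AI", "inference", "?"])
print(decode(context, 3))  # → ['tok0', 'tok1', 'tok2']
```

Because each decode step waits on the previous one, raw per-step speed dominates response latency, which is the metric behind Cerebras’s 25x claim.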
This service will be marketed as a premium option. “If you’re okay with slower inference, there are more affordable alternatives available,” noted Cerebras CEO Andrew Feldman. AWS has confirmed that it will continue to provide lower-cost inference solutions using only Trainium.
A Significant Week for Cerebras
This partnership follows closely on the heels of OpenAI’s separate deal with Cerebras in January 2026, said to be worth over $10 billion. That agreement aims to power OpenAI’s ChatGPT using Cerebras processors, with OpenAI looking to deploy up to 750 megawatts of computing power.
In February 2026, Cerebras raised $1 billion in a fresh funding round, pushing its total raised capital to $2.6 billion and giving the company an approximate valuation of $23 billion. Investors include Fidelity Management, Benchmark, Tiger Global, and Coatue.
Cerebras filed for an IPO in September 2024 but withdrew the filing about a year later.
Nvidia Faces Increasing Pressure
The AWS-Cerebras collaboration adds to the expanding set of challenges Nvidia faces in the inference market. The AI sector has been moving away from model training (an area where Nvidia’s GPUs hold a dominant position) toward inference workloads that require greater speed.
Nvidia isn’t remaining idle. In December 2025, it entered into a $20 billion licensing agreement with chip startup Groq. Additionally, the company plans to launch a new processing system based on Groq’s technology in the coming months.
AWS’s Nafea Bshara, co-founder of Annapurna Labs, summed up the aim of the Cerebras partnership: “Our role is to increase speed and cut prices.”
At the time of writing, Amazon (AMZN) stock had declined 0.44%.