NVIDIA’s Vera Rubin Changes AI Forever as the Platform Enters Full Production and Top AI Cloud Providers Race to Offer Rubin-Powered Systems Worldwide
Introduction
The next big leap in artificial intelligence computing is officially underway, and some of the world’s most important AI infrastructure players are lining up early. Nebius, Supermicro, and CoreWeave have all confirmed plans to roll out Nvidia’s powerful new Vera Rubin computing platform, marking a major shift in how next-generation AI models will be built, trained, and deployed.
With Nvidia confirming that Vera Rubin has entered full production and cloud providers targeting launches in late 2026, the race to dominate advanced AI infrastructure is heating up fast.
Nebius Brings Vera Rubin to the U.S. and Europe
Nebius (NBIS) was among the first to announce concrete deployment plans. The Netherlands-based AI infrastructure provider said it will begin offering Nvidia’s Vera Rubin NVL72 systems across the U.S. and Europe starting in the second half of 2026.
According to the company, the new platform will be rolled out through both Nebius AI Cloud and its enterprise-focused inference service, Nebius Token Factory. This move positions Nebius as one of the earliest AI cloud providers to deliver Rubin-based computing to customers.
Nebius plans to integrate the Vera Rubin NVL72 across its full-stack infrastructure at data centers on both continents. This setup is designed to give customers regional availability, more control over their workloads, and the ability to build advanced AI applications closer to where their data lives.
Founder and CEO Arkady Volozh said the integration is about speed and efficiency for the next wave of AI systems.
By combining Vera Rubin with Nebius AI Cloud and Token Factory, the company aims to give startups and enterprises the tools they need to develop agentic AI and advanced reasoning systems faster and with fewer bottlenecks.
Nebius describes Token Factory as an enterprise-ready platform focused on inference and post-training workloads, making it well suited for deploying complex AI models at scale.
The company also emphasized that Rubin will not replace its existing offerings. Instead, the new platform will complement Nebius’ current Nvidia GB200 NVL72 and Grace Blackwell Ultra NVL72 capacity, giving customers more flexibility when choosing the right hardware for their needs.

Supermicro and CoreWeave Join the Rubin Wave
Nebius isn’t alone in embracing Nvidia’s next-generation platform.
Supermicro (SMCI) separately confirmed support for Nvidia’s Vera Rubin NVL72 and HGX Rubin NVL8 computing platforms. Known for its server and data center hardware, Supermicro brings strong ecosystem support for Rubin across enterprise and hyperscale environments.
Meanwhile, AI cloud provider CoreWeave (CRWV) announced that it will also add Rubin technology to its platform. The company says this expansion will give customers more options when building and deploying agentic AI systems, reasoning models, and large-scale inference workloads.
CoreWeave expects to be among the first cloud providers to deploy the Nvidia Rubin platform when it becomes available in the second half of 2026, putting it in direct competition with other high-performance AI infrastructure providers.
Together, these announcements show that Rubin isn’t just another chip launch—it’s shaping up to be a foundational platform for the next generation of AI computing.
Nvidia Confirms Vera Rubin Is in Full Production
Adding momentum to the rollout, Nvidia CEO Jensen Huang confirmed earlier this week that Vera Rubin has officially entered full production.
That milestone means the platform is moving beyond development and into large-scale manufacturing, paving the way for cloud providers and enterprises to begin planning real-world deployments.
The update follows Nvidia’s high-profile unveiling of the Vera Rubin superchip at CES 2026 in Las Vegas, where the company positioned Rubin as the successor to its Grace Blackwell platform.
Inside Nvidia’s Vera Rubin Superchip
At the heart of the Rubin platform is the Vera Rubin superchip itself. One of six chips that make up the full Rubin system, the superchip combines one Vera CPU and two Rubin GPUs in a single package.
Nvidia says this tightly integrated design is built specifically for modern AI workloads, especially agentic AI, advanced reasoning models, and mixture-of-experts (MoE) architectures.
MoE models work by combining multiple specialized “expert” AIs and dynamically routing queries to the most relevant one, depending on what a user asks. This approach can dramatically improve efficiency and performance—but only if the underlying hardware can keep up.
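To make the routing idea concrete, here is a toy sketch of a mixture-of-experts dispatcher. The experts, the keyword-based gate, and all names in it are purely illustrative; production MoE layers use learned gating networks inside the model, not hand-written rules, and this is in no way Nvidia’s or any real model’s implementation.

```python
import math

# Toy mixture-of-experts router: each "expert" is just a function,
# and a gate scores how relevant each expert is to the query.
# Real MoE layers route per token with a learned gate; everything
# here is a hand-rolled stand-in for illustration only.

EXPERTS = {
    "code": lambda q: f"[code expert] handling: {q}",
    "math": lambda q: f"[math expert] handling: {q}",
    "chat": lambda q: f"[chat expert] handling: {q}",
}

def gate_scores(query: str) -> dict[str, float]:
    # Hypothetical keyword-based gate; production gates are learned.
    keywords = {"code": ["function", "bug"], "math": ["sum", "integral"], "chat": []}
    raw = {name: sum(query.count(k) for k in kws) for name, kws in keywords.items()}
    # Softmax so the scores form a probability distribution over experts.
    z = sum(math.exp(v) for v in raw.values())
    return {name: math.exp(v) / z for name, v in raw.items()}

def route(query: str) -> str:
    scores = gate_scores(query)
    best = max(scores, key=scores.get)  # top-1 routing: only one expert runs
    return EXPERTS[best](query)

print(route("please fix this bug in my function"))
```

The efficiency win is visible even in this toy: only the selected expert executes, so compute per query stays roughly constant no matter how many experts exist — which is exactly why the interconnect and memory system underneath must keep up with the routing traffic.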
According to Huang, the timing couldn’t be better.
AI computing demand for both training and inference is surging, and Nvidia believes Rubin represents a major step forward in meeting that demand. The company says its annual cadence of releasing new AI supercomputers, combined with deep hardware-software codesign across six chips, allows Rubin to push into entirely new performance territory.
Investors appeared to welcome the news, with Nvidia shares edging slightly higher in premarket trading following the launch.
A Full Platform, Not Just a Chip
What sets Rubin apart is that it’s not just a single processor—it’s a complete computing platform.
Alongside the Vera CPU and Rubin GPUs, the Rubin platform includes four additional networking and storage chips:
Nvidia NVLink 6 Switch
Nvidia ConnectX-9 SuperNIC
Nvidia BlueField-4 DPU
Nvidia Spectrum-6 Ethernet Switch
Together, these components are designed to handle the massive data movement and communication demands of today’s largest AI models.
All of this technology can be packaged into Nvidia’s Vera Rubin NVL72 server, which combines 72 GPUs into one system. When multiple NVL72 servers are linked together, they form Nvidia’s DGX SuperPOD—essentially a massive AI supercomputer.
These are the kinds of systems hyperscalers like Microsoft, Google, Amazon, and Meta are spending billions of dollars to deploy as they race to stay ahead in AI.

Faster, Cheaper, and More Efficient AI
Nvidia is also introducing a new AI storage solution called NVIDIA Inference Context Memory Storage, designed to store and share the enormous amounts of data generated by trillion-parameter models and multi-step reasoning systems.
Efficiency is a major selling point of Rubin compared to previous generations.
According to Nvidia, Rubin can reduce the number of GPUs needed to train the same mixture-of-experts model by up to four times compared to Grace Blackwell systems. Using fewer GPUs for the same job means companies can redeploy hardware to other tasks, improving overall utilization.
On the inference side, Nvidia claims Rubin delivers a 10x reduction in inference token costs.
Tokens are the basic building blocks AI models use to process information, including words, sentence fragments, images, and video. Token processing is extremely resource-intensive and consumes significant energy, especially at scale.
Lowering token costs could significantly reduce the total cost of ownership for AI systems running on Rubin, making advanced models more affordable to operate.
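A quick back-of-envelope calculation shows what those multipliers would mean in practice. Only the 4x training and 10x token-cost factors come from Nvidia’s claims; every absolute number below (cluster size, per-token price, inference volume) is a made-up placeholder for illustration.

```python
# Back-of-envelope math for Nvidia's stated efficiency multipliers.
# The 4x and 10x factors are Nvidia's claims; all absolute numbers
# (GPU counts, prices, volumes) are hypothetical placeholders.

blackwell_train_gpus = 4096                    # hypothetical MoE training cluster
rubin_train_gpus = blackwell_train_gpus / 4    # "up to 4x fewer GPUs"

blackwell_token_cost = 2.00                    # hypothetical $ per million tokens
rubin_token_cost = blackwell_token_cost / 10   # "10x reduction in token costs"

monthly_tokens_millions = 500_000              # hypothetical inference volume
savings = (blackwell_token_cost - rubin_token_cost) * monthly_tokens_millions

print(f"Training GPUs freed up: {blackwell_train_gpus - rubin_train_gpus:.0f}")
print(f"Monthly inference savings: ${savings:,.0f}")
```

Even with placeholder inputs, the shape of the argument is clear: fewer GPUs per training run means idle hardware can be redeployed, and a 10x cut in per-token cost compounds quickly at hyperscale inference volumes.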
Why Vera Rubin Matters
With Nebius, Supermicro, and CoreWeave all committing early support—and Nvidia confirming full production—Vera Rubin is shaping up to be one of the most important AI platforms of the next decade.
For enterprises and cloud providers, Rubin promises faster training, cheaper inference, and better efficiency for the most demanding AI workloads. For Nvidia, it reinforces the company’s dominance at the center of the global AI infrastructure boom.
And for the AI industry as a whole, Vera Rubin may be the platform that finally makes agentic AI and advanced reasoning models scalable on a global level.
Late 2026 may still be months away, but the future of AI computing is already taking shape—and it has Nvidia’s name written all over it.
