
NVIDIA Vera Rubin Enters Full Production, Set to Slash AI Inference Costs 10-Fold

By: CFM 2026-01-06 02:49 (UTC+0)

NVIDIA CEO Jensen Huang announced that the company's next-generation artificial intelligence platform, "Vera Rubin," has entered full-scale production and is scheduled to begin shipping to partners in the second half of 2026.

Compared with the previous-generation architecture, the Rubin platform has only 1.6 times as many transistors, yet it delivers a fivefold improvement in inference performance while cutting the cost per token by 90%, to one-tenth of its previous level.
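The headline's "10-fold" figure and the 90% per-token figure describe the same change, and the transistor and performance multipliers imply a per-transistor gain. The short sketch below works through that arithmetic using only the multipliers quoted in this article; the function names are illustrative and do not come from NVIDIA.

```python
def cost_reduction_fraction(fold: float) -> float:
    """Fraction of cost removed when cost falls to 1/fold of its original value."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return 1.0 - 1.0 / fold


def per_transistor_gain(perf_gain: float, transistor_gain: float) -> float:
    """Performance gain relative to the growth in transistor count."""
    if transistor_gain <= 0:
        raise ValueError("transistor_gain must be positive")
    return perf_gain / transistor_gain


# Figures quoted in the article (not independently measured).
reduction = cost_reduction_fraction(10)   # a 10-fold cut removes 90% of the cost
gain = per_transistor_gain(5.0, 1.6)      # 5x performance from 1.6x transistors

print(f"Cost per token falls by {reduction:.0%}")
print(f"Inference performance per transistor rises about {gain:.2f}x")
```

On these numbers, roughly a threefold share of the performance gain would have to come from design changes rather than from added transistors alone.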

Huang pointed out that NVIDIA broke from its traditional approach of updating only one or two chips per generation, instead completely redesigning all six chips from the ground up. These chips include the Vera central processing unit (CPU), the Rubin GPU, the NVLink switch, the ConnectX-9 SuperNIC, the BlueField-4 DPU, and the Spectrum-6 Ethernet switch.

The Vera CPU and Rubin GPU were not developed in isolation but were co-designed to enable high-speed, low-latency bidirectional data sharing. NVIDIA also designed the ConnectX-9 network card specifically for the Vera processor, and these technologies were announced together only after their integration was fully realized.

Huang emphasized that when training a model with 10 trillion parameters, the Vera Rubin system can complete the task in the same timeframe as a Blackwell system while occupying only a quarter of the physical space. Performance per watt has also improved roughly tenfold over Blackwell, a metric directly linked to data center profitability. In addition, the platform supports "confidential computing," meaning all data is encrypted in transit, at rest, and during computation.