Entry № 041-3 / V-1185 · 0:00 synced

NVIDIA Made a CPU.. I’m Holding It. - Grace CPU/Hopper SuperChip @ Computex 2023

Linus Tech Tips (@LinusTechTips) · 2.6M views · May 29, 2023 · 11:16
Source: YT · Views: 2.6M · Subscribers: 16.8M


Promos

Try Pulseway FREE today, and make IT monitoring simple at: lmg.gg

I'm at the Gigabyte booth at Computex 2023 where they're showing off bonkers new hardware from Nvidia!

Discuss on the forum: linustechtips.com

Immersion tank 1, A1P0-EB0 (rev. 100): gigabyte.com
Immersion tank 2, A1O3-CC0 (rev. 100): gigabyte.com
Big AI server (H100), G593-SD0 (rev. AAX1): gigabyte.com

CHAPTERS
0:00 Intro
0:22 Meet the Grace Super Chip!
1:22 We got permission for this...
3:13 ..but not for this.
4:40 Now for the GPU!
6:13 That's where the Interconnect comes in
7:32 There's "old-fashioned GPUs" too
8:35 Crazy network card
11:00 Outro

AI Overview

NVIDIA showcased the Grace CPU/Hopper SuperChip at Computex 2023, revealing a data-center-oriented design built around ARM-based processing and high-bandwidth interconnects. The display module at the Gigabyte booth is a Grace SuperChip with 72 ARM cores per CPU and an NVLink chip-to-chip interconnect, giving 144 cores per node. Four such modules can be deployed in a single 2U server chassis for a total of 576 cores.

The presentation emphasizes ARM advantages such as power efficiency while acknowledging that software and operating systems must be adapted for ARM. That is a non-trivial barrier in consumer and traditional PC markets, but less so in data centers, where developers often optimize software for their own needs. Data-center customers, including big cloud providers, may therefore migrate to ARM-based architectures to improve performance per watt, and the Grace architecture could offer compelling economics and performance for large-scale AI and analytics tasks.

A notable pairing discussed is Grace Hopper, which combines a Grace CPU with an H100 Hopper GPU via NVLink, delivering extremely high CPU-GPU bandwidth and substantial on-package memory: LPDDR5X for the CPU and HBM3 for the GPU. The video gives concrete figures: 900 GB/s of CPU-GPU interconnect bandwidth between the Grace and Hopper modules, versus a 64 GB/s peak for a GPU in a standard PCIe Gen5 x16 slot, and up to 4 TB/s of memory bandwidth on the GPU side with HBM3. HBM remains a premium resource, which limits capacity to 96 GB on the H100 in this setup, but NVLink still allows a practical, transparent memory access model. NVIDIA positions this stack as a path to massive scale, with memory and bandwidth advantages that could support multi-GPU deployments and large AI and data-center workloads, while also noting power and cost considerations in accelerated environments.

The segment concludes with a nod to traditional GPU configurations and to faster networking components such as the ConnectX-7 and BlueField-3, underscoring NVIDIA's strategy of offering both ARM-based superchips and classic PCIe/HGX options for different customers. The host commentary also hints at implications for software licensing and per-core pricing in cloud environments, suggesting that offloading tasks to specialized network and compute resources could reshape licensing models and revenue strategies in large-scale deployments.

The narrative reflects a broader industry shift toward disaggregated and tightly integrated accelerators, where memory proximity, high-bandwidth interconnects, and specialized networking components contribute to overall system efficiency and throughput. Overall, the coverage frames the Grace and Grace Hopper combination as a bold, high-end entry into the data-center race for efficiency and scale, while acknowledging practical considerations like software support, cost, and the ongoing coexistence with traditional x86 configurations. The piece also teases future iterations and the potential for even denser, more capable GPU/CPU pairings, signaling NVIDIA's intent to maintain leadership in AI workloads and HPC infrastructure through a hybrid ARM-GPU strategy and advanced interconnects.
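
The core counts and the interconnect comparison quoted in the overview are simple arithmetic, and a short Python sketch makes the relationships explicit. The constant and function names are illustrative, the two-CPUs-per-module value is inferred from the 72 and 144 figures rather than stated directly, and all bandwidths are the peak numbers quoted above, not measurements.

```python
# Figures quoted in the overview (peak values, not measurements).
CORES_PER_GRACE_CPU = 72
CPUS_PER_SUPERCHIP = 2    # assumption: inferred from 144 cores / 72 cores per CPU
SUPERCHIPS_PER_2U = 4     # modules per 2U chassis, as stated in the overview

NVLINK_C2C_GB_S = 900     # CPU-GPU bandwidth quoted for Grace Hopper
PCIE_GEN5_GB_S = 64       # PCIe Gen5 peak quoted in the overview


def cores_per_superchip() -> int:
    """Total ARM cores on one Grace SuperChip module."""
    return CORES_PER_GRACE_CPU * CPUS_PER_SUPERCHIP


def cores_per_chassis() -> int:
    """Total ARM cores in a 2U chassis filled with SuperChip modules."""
    return cores_per_superchip() * SUPERCHIPS_PER_2U


def interconnect_ratio() -> float:
    """How many times faster the quoted NVLink figure is than the PCIe one."""
    return NVLINK_C2C_GB_S / PCIE_GEN5_GB_S


if __name__ == "__main__":
    print(f"cores per SuperChip:  {cores_per_superchip()}")
    print(f"cores per 2U chassis: {cores_per_chassis()}")
    print(f"NVLink vs PCIe Gen5:  {interconnect_ratio():.1f}x")
```

With the quoted numbers this gives 144 cores per module, 576 per chassis, and roughly a 14x gap between the NVLink and PCIe Gen5 figures.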

Topics · technology · data_center · ai · cpu · gpu · arm · hpc · servers

Questions answered

What is the Grace CPU/Hopper SuperChip configuration and how is it connected?
The Grace CPU/Hopper SuperChip combines ARM-based Grace CPUs with NVIDIA H100 Hopper GPUs using NVLink chip-to-chip interconnects, enabling high bandwidth between CPU and GPU. Each Grace CPU has 72 cores and, when paired with the H100 Hopper GPU, gets up to 900 GB/s of CPU-GPU interconnect bandwidth.
Why is ARM architecture used in this data-center setup and what are the implications for software?
ARM offers higher power efficiency, which is beneficial in dense data-center deployments. However, software and operating systems must be compiled and optimized for ARM, which presents a backward-compatibility hurdle for PC-focused markets but is less prohibitive for data centers where custom software is common.
What are the memory capabilities and why do they matter for AI workloads?
The Grace CPU can access up to 480 GB of LPDDR5X memory per CPU, and the Hopper GPU uses HBM3 memory with up to 4 TB/s of bandwidth. Both memory types sit on the package, close to the processor they serve, for fast access. This proximity and bandwidth are crucial for large AI datasets and high-throughput compute, enabling substantial performance gains for AI training and inference.
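
To give a feel for why these bandwidth figures matter, the following Python sketch estimates the idealized time to move a full memory pool across each link. It uses only capacities and peak bandwidths quoted on this page, assumes decimal units (1 TB/s = 1000 GB/s), and ignores latency and protocol overhead, so real transfers would be slower. The function and constant names are illustrative.

```python
def transfer_seconds(size_gb: float, bandwidth_gb_s: float) -> float:
    """Idealized time to move size_gb at a sustained bandwidth_gb_s."""
    if bandwidth_gb_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_s


# Capacities and peak bandwidths as quoted on this page.
HBM3_CAPACITY_GB = 96
HBM3_BANDWIDTH_GB_S = 4000   # 4 TB/s, assuming decimal units
LPDDR5X_CAPACITY_GB = 480
NVLINK_C2C_GB_S = 900
PCIE_GEN5_GB_S = 64

if __name__ == "__main__":
    # One full sweep of GPU memory by the GPU itself.
    print(f"HBM3 sweep:      {transfer_seconds(HBM3_CAPACITY_GB, HBM3_BANDWIDTH_GB_S):.3f} s")
    # Streaming the whole CPU memory pool to the GPU over each link.
    print(f"480 GB via NVLink: {transfer_seconds(LPDDR5X_CAPACITY_GB, NVLINK_C2C_GB_S):.3f} s")
    print(f"480 GB via PCIe:   {transfer_seconds(LPDDR5X_CAPACITY_GB, PCIE_GEN5_GB_S):.3f} s")
```

Under these assumptions, reading all 96 GB of HBM3 takes about 0.024 s, while streaming the 480 GB CPU pool to the GPU takes about 0.53 s over NVLink and 7.5 s over the quoted PCIe Gen5 figure.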