The Biggest Test Bench I’ve Ever Seen

Linus Tech Tips (@LinusTechTips) · 446.6K views · Jun 6, 2026 · 11:14
Source: YT
Views: 446.6K
Subscribers: 16.8M

Description

Check out ASUS's next gen ASUS AI POD with NVIDIA Vera Rubin NVL72: asus.com
Learn more about NVIDIA DSX: nvidia.com

A test bench usually means power, cooling, and tools at your fingertips. But what do you do when the hardware you need to validate pulls over 100,000 watts? ASUS sponsored us to tour their data center R&D lab, where they develop and torture test their enterprise AI server hardware. We got our first look under the hood of a liquid cooled GB300, stepped inside a 40C server oven, and found out why their chiller is less chill than it should be.

Read the Labs article: lttlabs.com
Discuss on the forum: linustechtips.com

Channels and socials

Check out our Channel Partners:
Secretlab - Grab a TITAN Evo ergonomic gaming chair: lmg.gg
PIA - Get the VPN of our choice: piavpn.com
dbrand - Buy a "Circuit" series skin for your device: dbrand.com

► SHOP LTT PRODUCTS: lttstore.com
► GET EXCLUSIVE CONTENT ON FLOATPLANE: lmg.gg
► DIVE DEEPER ON THE LTT LABS WEBSITE: lmg.gg
► SPONSORS, AFFILIATES, AND PARTNERS: lmg.gg

Purchases made through some store links may provide some compensation to Linus Media Group. Affiliate links powered in part by affilimate.com

Linus Sebastian is an investor in Framework Computer, Inc and HexOS by Eshtek.

CHAPTERS
0:00 Intro
1:10 R&D Lab
2:00 Liquid-Cooled GB300
4:15 Back in the Lab
6:25 Chiller
7:45 Environmental Chamber
6:15 "2nd Thermal Lab"
9:55 Software
11:02 Outro

AI Overview

The video tours ASUS's data center R&D lab and shows how a test bench scales up to validate enterprise AI hardware. The host starts by framing the purpose of a test bench: it provides power, cooling, and essential tools in a controlled environment so that devices under test can be swapped quickly and measured against prior models or competing solutions.

The centerpiece is a Grace Blackwell GB300 compute node with dual NVIDIA Bianca boards, each housing a 72-core Grace CPU and two Blackwell Ultra GPUs with large HBM3e memory. Each GPU draws up to about 1,400 watts, contributing to a total power budget of around 8,000 watts for the node. The discussion covers the shift from air cooling to liquid cooling, the role of the coolant distribution unit under the floor, and the importance of leak sensors and integrated monitoring and control software, all needed to handle the power demands of modern AI hardware. The host notes that Vera Rubin, NVIDIA's next-generation platform, will be even more power hungry and heavier, at nearly two tons per rack, which necessitates floor reinforcements and upgraded infrastructure.

Throughout, the tour emphasizes that close collaboration with the mass production side is crucial for reproducing and debugging issues quickly, aided by software suites for planning, deployment, and ongoing management of hardware, firmware, and networks. The segment also highlights environmental and thermal testing tools, including a 100,000-watt environmental chamber and a dedicated 45 °C validation chamber, which support long-term aging tests and dynamic loading scenarios.

The tour closes by contrasting the lab's flexible, test-focused environment with the way the broader data center ecosystem uses software such as AIDC, which covers lifecycle management from design to deployment and provides custom dashboards for tracking metrics such as power, cooling, carbon emissions, and service quality. This underscores the lab's role in shaping reliable, scalable enterprise AI deployments.
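
The power figures quoted in the overview can be cross-checked with a little arithmetic. The sketch below is only an illustration: it uses the approximate numbers from the overview (about 1,400 W per GPU, two GPUs per Bianca board, two boards, a total of about 8,000 W) and assumes that the 8,000 W figure covers the two-board compute node. The constant and function names are made up for this example.

```python
# Approximate figures quoted in the overview ("up to about" values).
GPU_POWER_W = 1_400      # per Blackwell Ultra GPU
GPUS_PER_BOARD = 2       # per Bianca board
BOARDS_PER_NODE = 2      # dual Bianca boards
NODE_BUDGET_W = 8_000    # assumed to apply to the two-board compute node


def gpu_power_w() -> int:
    """Combined peak draw of all GPUs on both boards, in watts."""
    return GPU_POWER_W * GPUS_PER_BOARD * BOARDS_PER_NODE


def remaining_w() -> int:
    """Budget left for CPUs, memory, networking, and conversion losses."""
    return NODE_BUDGET_W - gpu_power_w()


def gpu_share() -> float:
    """Fraction of the quoted budget consumed by the GPUs alone."""
    return gpu_power_w() / NODE_BUDGET_W


if __name__ == "__main__":
    print(f"GPUs: {gpu_power_w()} W")
    print(f"Everything else: {remaining_w()} W")
    print(f"GPU share of budget: {gpu_share():.0%}")
```

Under these assumptions the four GPUs account for 5,600 W, or 70% of the quoted budget, which leaves roughly 2,400 W for the Grace CPUs and the rest of the node.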

Topics · data centers · hardware testing · ai hardware · server hardware · cooling systems

Questions answered

What is the Grace Blackwell GB300 and why is it significant in ASUS's lab tour?
The Grace Blackwell GB300 is a compute node built around dual NVIDIA Bianca boards, each carrying a 72-core Grace CPU and two Blackwell Ultra GPUs with large HBM3e memory. Each GPU consumes up to about 1,400 watts, contributing to a power budget of around 8,000 watts for the node. It exemplifies the high power and data throughput needs of enterprise AI hardware, and its cooling and power supply design show how ASUS validates performance under demanding conditions.
How does ASUS validate the reliability and cooling of high-power hardware in the lab?
ASUS combines liquid cooling fed by a coolant distribution unit with leak sensors and monitoring software, so that hardware can be run under realistic workloads. It uses long-term aging tests and environmental chambers to check thermal stability across wide temperature ranges and dynamic loads, plus software tools like AIDC for planning, deploying, and monitoring systems throughout their lifecycle.