AWS bets on liquid cooling for its AI servers

It’s AWS re:Invent 2024 this week, Amazon’s annual cloud computing extravaganza in Las Vegas, and as is tradition, the company has so much to announce, it can’t fit everything into its five (!) keynotes. Ahead of the show’s official opening, AWS on Monday detailed a number of updates to its overall data center strategy that are worth paying attention to.

The most important of these is that AWS will soon start using liquid cooling for its AI servers and other machines, regardless of whether those are based on its homegrown Trainium chips or Nvidia’s accelerators. Specifically, AWS notes that its Trainium2 chips (which are still in preview) and “rack-scale AI supercomputing solutions like NVIDIA GB200 NVL72” will be cooled this way.

It’s worth highlighting that AWS stresses that these updated cooling systems can integrate both air and liquid cooling. After all, there are still plenty of other servers in the data centers that handle networking and storage, for example, that don’t require liquid cooling. “This flexible, multimodal cooling design allows AWS to provide maximum performance and efficiency at the lowest cost, whether running traditional workloads or AI models,” AWS explains.

The company also announced that it is moving to simpler electrical and mechanical designs for its servers and server racks.

“AWS’s latest data center design improvements include simplified electrical distribution and mechanical systems, which enable infrastructure availability of 99.9999%. The simplified systems also reduce the potential number of racks that can be impacted by electrical issues by 89%,” the company notes in its announcement. In part, AWS is doing this by reducing the number of times the electricity gets converted on its way from the electrical network to the server.
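To put the 99.9999% availability figure in perspective, it can be converted into a yearly downtime budget. The short Python sketch below does that arithmetic; the function name and the choice of a 365-day year are assumptions made for illustration, not anything AWS published.

```python
# Convert an availability percentage into the downtime it allows per year.
# Assumption: a non-leap year of 365 days.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def annual_downtime_seconds(availability_percent: float) -> float:
    """Return the seconds of downtime per year permitted by the given availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    downtime = annual_downtime_seconds(99.9999)
    print(f"{downtime:.1f} seconds of downtime per year")
```

Under that assumption, "six nines" works out to roughly half a minute of infrastructure downtime per year.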

AWS didn’t provide many more details than that, but this likely means using DC power to run the servers and/or HVAC system and avoiding many of the AC-DC-AC conversion steps (with their inherent losses) otherwise necessary.
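Why fewer conversion steps matter comes down to multiplication: each stage passes on only a fraction of the power it receives, so the losses compound. The sketch below illustrates this in Python. The per-stage efficiency of 96% and the stage counts are hypothetical values chosen for illustration; AWS has not disclosed its actual figures.

```python
from math import prod

# Hypothetical per-stage efficiency, for illustration only (not an AWS figure).
ASSUMED_STAGE_EFFICIENCY = 0.96


def chain_efficiency(stage_efficiencies):
    """Overall efficiency of a power path is the product of its stage efficiencies."""
    return prod(stage_efficiencies)


if __name__ == "__main__":
    # Hypothetical comparison: a path with four conversions vs. one with two.
    four_stages = chain_efficiency([ASSUMED_STAGE_EFFICIENCY] * 4)
    two_stages = chain_efficiency([ASSUMED_STAGE_EFFICIENCY] * 2)
    print(f"four conversions: {four_stages:.1%} of input power delivered")
    print(f"two conversions:  {two_stages:.1%} of input power delivered")
```

With these assumed numbers, halving the number of conversions raises delivered power from about 85% to about 92% of the input, which is the kind of gain a simplified distribution path is aiming for.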

“AWS continues to relentlessly innovate its infrastructure to build the most performant, resilient, secure, and sustainable cloud for customers worldwide,” said Prasad Kalyanaraman, vice president of Infrastructure Services at AWS, in Monday’s announcement. “These data center capabilities represent an important step forward with increased energy efficiency and flexible support for emerging workloads. But what is even more exciting is that they are designed to be modular, so that we are able to retrofit our existing infrastructure for liquid cooling and energy efficiency to power generative AI applications and lower our carbon footprint.”

In total, AWS says, the new multimodal cooling system and upgraded power delivery system will let the organization “support a 6x increase in rack power density over the next two years, and another 3x increase in the future.”

In this context, AWS also notes that it is now using AI to predict the most efficient way to position racks in the data center to reduce the amount of unused or underutilized power. AWS will also roll out its own control system across its electrical and mechanical devices in the data center, which will come with built-in telemetry services for real-time diagnostics and troubleshooting.

“Data centers must evolve to meet AI’s transformative demands,” said Ian Buck, vice president of hyperscale and HPC at Nvidia. “By enabling advanced liquid cooling solutions, AI infrastructure can be efficiently cooled while minimizing energy use. Our work with AWS on their liquid cooling rack design will allow customers to run demanding AI workloads with exceptional performance and efficiency.”




Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes health, sport, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
