Data center services provider Salute has launched a new operational service for direct-to-chip liquid cooling, aiming to address the intense thermal challenges created by artificial intelligence and high-performance computing workloads. The service, announced at Nvidia GTC in Washington, D.C., gives data center operators a standardized framework for managing high-density liquid cooling, a technology that is rapidly becoming essential for the latest generation of processors. Early adopters include Applied Digital, Compass Datacenters, and SDC, which will use the service to support an estimated 260 megawatts of data center capacity in the coming months.
The explosive growth of AI has pushed modern processors, particularly GPUs, to power densities that far exceed what traditional air-cooling systems can handle. AI workloads require immense computational power, leading to significant heat generation that can degrade hardware performance, shorten equipment lifespan, and cause costly downtime if not managed effectively. As rack power densities climb, in some cases exceeding 100kW, liquid cooling has emerged as a necessary solution. Direct-to-chip cooling, which applies liquid coolant directly to the hottest components, offers a highly efficient method of heat removal, but it also introduces new operational risks, including potential leaks and the need for specialized staff training. Salute’s new service aims to mitigate these risks by providing a comprehensive model for deploying and managing these advanced cooling systems at scale.
The Soaring Thermal Challenge in AI Infrastructure
The computational demands of artificial intelligence are fundamentally reshaping data center design, with thermal management emerging as a primary obstacle. AI and machine learning models require specialized hardware, such as GPUs and other accelerators, that can draw as much as 1000W per chip. This concentration of power results in extreme heat densities that traditional air-cooling methods are ill-equipped to handle. While effective for lower power loads, air cooling becomes inefficient and expensive as heat loads increase, struggling to maintain the stable temperatures required for optimal performance.
Failure to adequately manage this heat has severe consequences. Prolonged exposure to high temperatures can accelerate the degradation of sensitive electronic components, leading to premature hardware failures. More immediately, modern processors are designed to automatically reduce their processing speed—a process known as throttling—to prevent overheating. This directly compromises computational performance and reduces the efficiency of the entire system. For mission-critical AI applications, such performance bottlenecks can translate to significant financial and operational losses. Furthermore, ineffective cooling systems consume more energy, driving up operational costs and contributing to a larger environmental footprint, a growing concern for the industry.
Direct-to-Chip Cooling as a Necessary Evolution
To overcome the limitations of air cooling, the data center industry is increasingly turning to liquid-based solutions, with direct-to-chip (DTC) cooling being a leading approach for high-density environments. This technology involves mounting cold plates directly onto the primary heat-generating components on a server’s motherboard, such as CPUs and GPUs. A specialized coolant circulates through channels within these plates, absorbing heat with high efficiency before carrying it away from the sensitive hardware.
How DTC Systems Operate
The heated liquid flows out of the server racks to a coolant distribution unit (CDU), which acts as the central hub of the system. The CDU manages the flow rate and temperature of the coolant, using a heat exchanger to transfer the thermal energy from the primary coolant loop to a secondary facility water loop, which then dissipates the heat outside the data center. Because liquid carries far more heat per unit volume than air, this approach lets data centers support much higher rack densities while improving energy efficiency. By removing heat at its source, DTC systems reduce the need for the large, power-hungry fans and chillers that dominate air-cooled facilities.
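The heat-transfer advantage of liquid can be made concrete with the sensible-heat relation Q = m_dot * cp * dT. The following is a minimal Python sketch, not a sizing tool. The function name, the 10 K coolant temperature rise, and the fluid properties (plain water and room-temperature air) are assumptions chosen for illustration; real DTC loops often use water-glycol mixtures with somewhat different properties. The 100 kW load is the rack density cited earlier in this article.

```python
def required_flow_lpm(heat_load_w, delta_t_k, cp_j_per_kg_k, density_kg_per_m3):
    """Volumetric flow (litres per minute) needed to carry heat_load_w
    with a fluid temperature rise of delta_t_k, from Q = m_dot * cp * dT."""
    if heat_load_w < 0 or delta_t_k <= 0:
        raise ValueError("heat load must be >= 0 and temperature rise > 0")
    mass_flow_kg_s = heat_load_w / (cp_j_per_kg_k * delta_t_k)
    volume_flow_m3_s = mass_flow_kg_s / density_kg_per_m3
    return volume_flow_m3_s * 1000.0 * 60.0  # m^3/s -> L/min


RACK_LOAD_W = 100_000.0  # 100 kW rack, the density cited in the article
DELTA_T_K = 10.0         # assumed fluid temperature rise across the rack

# Assumed textbook properties: water (cp 4186 J/kg.K, 997 kg/m^3)
# and air (cp 1005 J/kg.K, 1.2 kg/m^3).
water_lpm = required_flow_lpm(RACK_LOAD_W, DELTA_T_K, 4186.0, 997.0)
air_lpm = required_flow_lpm(RACK_LOAD_W, DELTA_T_K, 1005.0, 1.2)

print(f"water: {water_lpm:.0f} L/min")
print(f"air:   {air_lpm / 60000.0:.1f} m^3/s")
print(f"air needs about {air_lpm / water_lpm:.0f}x the volume flow of water")
```

Under these assumptions, water needs on the order of 140 litres per minute to carry the rack's heat, while air needs roughly 8 cubic metres per second, a volume flow thousands of times larger.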
Single-Phase vs. Two-Phase Cooling
DTC systems primarily use one of two processes. In single-phase cooling, the coolant remains in its liquid state throughout the entire cycle, which offers a reliable and predictable method of thermal management. Two-phase cooling, a more advanced technique, allows the coolant to boil and turn into a vapor as it absorbs heat from the chip. This phase change allows it to absorb significantly more energy, offering even greater cooling potential. However, both approaches require a robust and meticulously managed operational framework to prevent leaks and ensure system integrity, as any failure could be catastrophic for the electrical equipment.
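The difference between the two processes comes down to sensible heat (warming a liquid) versus latent heat (boiling it). The sketch below compares the energy absorbed per kilogram in each case. It uses water at atmospheric pressure only because its properties are well known; two-phase DTC systems typically run engineered dielectric fluids whose latent heat is considerably lower, so the ratio here is illustrative rather than representative. The function names and the 10 K temperature rise are assumptions.

```python
def sensible_heat_j_per_kg(cp_j_per_kg_k, delta_t_k):
    """Energy absorbed per kg when the coolant only warms up (single-phase)."""
    return cp_j_per_kg_k * delta_t_k


def two_phase_heat_j_per_kg(cp_j_per_kg_k, delta_t_to_boil_k, latent_j_per_kg):
    """Energy absorbed per kg when the coolant warms to its boiling point
    and then fully vaporises (two-phase)."""
    return cp_j_per_kg_k * delta_t_to_boil_k + latent_j_per_kg


WATER_CP = 4186.0        # J/(kg.K), liquid water
WATER_LATENT = 2.257e6   # J/kg, water at atmospheric pressure (illustrative only)

single = sensible_heat_j_per_kg(WATER_CP, 10.0)               # 10 K rise, no boiling
double = two_phase_heat_j_per_kg(WATER_CP, 0.0, WATER_LATENT)  # already at boiling point

print(f"single-phase: {single / 1000:.1f} kJ/kg")
print(f"two-phase:    {double / 1000:.1f} kJ/kg")
print(f"ratio:        {double / single:.0f}x")
```

The phase change absorbs its energy at a nearly constant temperature, which is why a two-phase loop can, in principle, move the same heat with less coolant flow.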
Salute’s Comprehensive Operational Framework
Salute’s new service is designed to provide the specialized expertise and standardized procedures necessary to manage DTC systems effectively. Recognizing that the technology requires a new operational model, the company has developed a multi-part offering to de-risk the transition to liquid cooling for data center operators. The service begins with a design and operational assessment, which produces an operational plan tailored to each facility’s design and infrastructure.
A core component of the service is commissioning support, which ensures that all systems are installed and configured correctly to support AI and HPC operations as intended from day one. Salute also provides extensive training programs, including classroom instruction, online learning modules, and laboratory certification to upskill data center teams. This is supplemented by ongoing operational support to help clients scale their teams and environments. Furthermore, the company maintains a continuously updated library of best practices derived from its work with leading technology partners, including Nvidia, coolant manufacturers, and hyperscale cloud providers.
Addressing Risks and Ensuring Scalability
While DTC cooling is a powerful solution, its implementation brings unique challenges that can endanger the substantial investments made in AI hardware. Even brief interruptions in the coolant flow can cause rapid temperature spikes, potentially damaging processors. Moreover, leaks pose a serious hazard, risking expensive equipment failures and creating safety risks for personnel. Salute’s operational model directly confronts these issues by providing meticulously documented procedures for every aspect of DTC operations.
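A rough lumped-mass estimate shows why even a brief loss of flow matters. The sketch below assumes a 1000W chip (the per-chip figure cited earlier in this article) attached to a hypothetical 0.5 kg copper cold plate, with all heat going into the plate once coolant stops moving. The mass, the 30 K allowable rise, and the function names are assumptions; the model ignores residual coolant, heat spreading into the board, and processor throttling, all of which would slow the rise in practice.

```python
def heating_rate_k_per_s(power_w, mass_kg, specific_heat_j_per_kg_k):
    """Temperature rise rate of a lumped mass absorbing power_w with no cooling."""
    if mass_kg <= 0 or specific_heat_j_per_kg_k <= 0:
        raise ValueError("mass and specific heat must be positive")
    return power_w / (mass_kg * specific_heat_j_per_kg_k)


def seconds_to_rise(delta_t_k, rate_k_per_s):
    """Time for the lumped mass to warm by delta_t_k at a constant rate."""
    return delta_t_k / rate_k_per_s


CHIP_POWER_W = 1000.0      # per-chip draw cited in the article
PLATE_MASS_KG = 0.5        # hypothetical cold plate mass
COPPER_CP = 385.0          # J/(kg.K), specific heat of copper
ALLOWED_RISE_K = 30.0      # hypothetical margin before thermal limits

rate = heating_rate_k_per_s(CHIP_POWER_W, PLATE_MASS_KG, COPPER_CP)
margin_s = seconds_to_rise(ALLOWED_RISE_K, rate)

print(f"heating rate: {rate:.1f} K/s")
print(f"time to rise {ALLOWED_RISE_K:.0f} K: {margin_s:.1f} s")
```

With these illustrative numbers the plate warms by about 5 K per second, leaving only a few seconds of margin, which is consistent with the emphasis on documented emergency procedures and continuous monitoring.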
The framework includes detailed emergency protocols, standard operating procedures, and preventative maintenance schedules. It covers critical areas such as coolant chemistry management, leak detection and management, safety protocols, and risk mitigation strategies. According to Salute, these procedures are written by experts with experience managing some of the world’s largest supercomputers. This systematic approach is designed to prepare data center staff for the contingencies they are likely to face, thereby protecting investments and uptime. Early adopters are projected to scale their DTC-supported capacity from 260MW to an estimated 3,300MW by the end of 2027, a nearly thirteenfold increase that points to strong demand for the service.
Early Adoption and Industry Validation
The immediate adoption of Salute’s service by prominent data center providers underscores the pressing need for standardized DTC operations. Applied Digital, a company focused on building digital infrastructure for AI, has integrated the service into its facilities. Laura Laltrello, the company’s Chief Product Officer, stated that high-density environments require a completely new operational model and that partnering with Salute allows them to deliver world-class operations that mitigate risk and protect customer investments.
Similarly, SDC has implemented the model across multiple sites globally to support its customers’ expanding AI and HPC deployments. SDC founder Walter Wang noted that the service solves a critical problem by enabling accelerated AI deployments with zero downtime. He highlighted Salute’s ability to perform at scale under rapid timelines as a crucial factor for clients who are growing their AI operations worldwide. According to John Shultz, Salute’s Chief Product Officer for AI, the rapid adoption by these partners proves the service is a “game changer” that enables companies to expand their AI operations to meet soaring customer demand.