Lolly Zheng- Sales Account Manager at NextPCB.com
The artificial intelligence hardware landscape is evolving at a breakneck pace. As we transition deeper into the era of generative AI and massive large language models (LLMs), the power requirements for GPU accelerators are skyrocketing. Previously, high-performance data center chips operated within a Thermal Design Power (TDP) of 300W to 400W. Today, as detailed in our analysis of the NVIDIA Blackwell Architecture Explained: B200, GB200 & PCB Design Impact, single accelerators are pushing past the 1000W barrier.
At these extreme power densities, traditional air cooling is no longer practical. The volume of air required to dissipate 1000W of heat from a small silicon die exceeds the acoustic, spatial, and mechanical limits of standard server chassis. Enter the liquid-cooled server. By leveraging the vastly superior heat capacity of liquid coolants, data centers can maintain optimal operating temperatures for next-generation hardware. For PCB layout engineers, hardware architects, and system integrators, the transition to AI server liquid cooling changes how printed circuit boards (PCBs) are designed, manufactured, and assembled.
This comprehensive guide explores the integration of cold plate cooling and manifold systems into AI servers, and dissects the rigorous board-level PCB design requirements necessary to support these advanced thermal solutions. For a broader overview of hardware infrastructure, you may also refer to What Is an AI Server? Architecture, Components & PCB Requirements.
To understand why liquid cooling has become a necessity, we must look at the physics of heat transfer. The fundamental limit of air cooling lies in the low thermal conductivity and low volumetric heat capacity of air compared to liquid coolants (typically treated water or specialized dielectric fluids). Water, for instance, has a thermal conductivity approximately 24 times greater than air, and a heat capacity by volume that is over 3,000 times greater.
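To make the volumetric comparison concrete, the sketch below estimates how much air and how much water must flow past a 1000W heat source to carry the heat away, using the energy balance P = ρ · c_p · Q · ΔT. The fluid properties are rounded room-temperature textbook values, and the 10 K coolant temperature rise is an assumption chosen for illustration, not a figure from any specific server design.

```python
# Rounded room-temperature fluid properties (illustrative textbook values).
AIR = {"density": 1.2, "cp": 1005.0}       # kg/m^3, J/(kg*K)
WATER = {"density": 997.0, "cp": 4180.0}   # kg/m^3, J/(kg*K)


def volumetric_flow_m3_per_s(power_w, fluid, delta_t_k):
    """Flow needed to absorb power_w while the fluid warms by delta_t_k."""
    return power_w / (fluid["density"] * fluid["cp"] * delta_t_k)


def to_cfm(flow_m3_s):
    """Convert m^3/s to cubic feet per minute."""
    return flow_m3_s * 2118.88


def to_lpm(flow_m3_s):
    """Convert m^3/s to litres per minute."""
    return flow_m3_s * 60_000.0


POWER_W = 1000.0    # one accelerator at the TDP discussed in the text
DELTA_T_K = 10.0    # assumed coolant temperature rise

air_flow = volumetric_flow_m3_per_s(POWER_W, AIR, DELTA_T_K)
water_flow = volumetric_flow_m3_per_s(POWER_W, WATER, DELTA_T_K)

print(f"Air:   {to_cfm(air_flow):.0f} CFM")
print(f"Water: {to_lpm(water_flow):.2f} L/min")
print(f"Air needs about {air_flow / water_flow:.0f}x the volume of water")
```

Under these assumptions a single accelerator needs well over a hundred CFM of air but only on the order of a litre or two of water per minute, which is the gap the "over 3,000 times" figure describes.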
For more foundational knowledge on conventional heat dissipation, check out our guide on Thermal Management on AI Server PCBs: Copper Coin, Thermal Vias and Heatsink Integration. However, when thermal vias and copper coins fall short, direct-to-chip (D2C) liquid cooling takes over.
| Feature | Traditional Air Cooling | Direct-to-Chip (D2C) Liquid Cooling |
|---|---|---|
| Cooling Medium | Conditioned Air | Propylene Glycol / Water Mix (PGW) or Dielectric Fluid |
| Max TDP Supported per Chip | ~500W to 600W (with massive heatsinks) | 1000W to 1500W+ |
| PCB Keepout Zones | Moderate (for heatsink fins and airflow paths) | High (for coolant tubes, cold plates, and manifold fittings) |
| Mechanical Stress on PCB | High (Heatsinks can weigh several kilograms) | Moderate to High (Weight of cold plate + coolant fluid + tube tension) |
| Risk of Leakage | None | Present (requires leak detection circuitry and blind-mate quick disconnects) |
| Rack Power Density | 10kW to 20kW per rack | 50kW to 100kW+ per rack |
In a liquid-cooled PCB environment, the most common architecture is Direct-to-Chip (D2C) cooling, which uses a component known as a cold plate. Unlike immersion cooling, where the entire PCB is submerged in a dielectric fluid, cold plate cooling targets specific high-power components: the GPUs, CPUs, and high-bandwidth networking switches like those discussed in What Is NVSwitch? The Silicon Behind NVIDIA's GPU Cluster Scale-Out.
[ Coolant IN ] ------> [ Internal Micro-Skived Fins ] ------> [ Coolant OUT ]
                                      |
                            [ Copper Base Plate ]
                                      |
                    [ Thermal Interface Material (TIM) ]
                                      |
             [ AI GPU Silicon Die (e.g., H100 / B200 / MI300X) ]
                                      |
                       [ Advanced Substrate / CoWoS ]
                                      |
                            [ BGA Solder Balls ]
                                      |
                  [ AI Server Printed Circuit Board (PCB) ]
The cold plate is typically machined from high-purity copper or aluminum. Inside the cold plate, micro-skived fins dramatically increase the surface area in contact with the flowing coolant. Heat generated by the silicon die passes through the Thermal Interface Material (TIM), into the copper base plate, and is absorbed by the liquid. The heated liquid then exits the cold plate and flows to a Coolant Distribution Unit (CDU), where a heat exchanger rejects the heat before the coolant is recirculated.
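The heat path described above can be approximated as thermal resistances in series, where each layer adds a temperature rise of power times resistance. The following is a minimal sketch of that model. Every resistance value and the inlet temperature are hypothetical placeholders chosen only to show the arithmetic; real values come from the TIM and cold plate datasheets.

```python
def die_temperature_c(power_w, coolant_inlet_c, resistances_k_per_w):
    """Series thermal-resistance model: T_die = T_inlet + P * sum(R_i)."""
    return coolant_inlet_c + power_w * sum(resistances_k_per_w.values())


# Hypothetical per-layer thermal resistances in K/W (placeholders, not vendor data).
stack = {
    "tim": 0.010,             # thermal interface material
    "copper_base": 0.005,     # conduction through the base plate
    "fin_to_coolant": 0.020,  # convection from the skived fins into the liquid
}

POWER_W = 1000.0          # accelerator power from the text
COOLANT_INLET_C = 35.0    # assumed coolant inlet temperature

t_die = die_temperature_c(POWER_W, COOLANT_INLET_C, stack)
print(f"Estimated die temperature: {t_die:.1f} °C")

# Show which layer dominates the temperature rise.
for layer, r in stack.items():
    print(f"  {layer}: {POWER_W * r:.1f} K rise")
```

With these placeholder numbers the die sits at about 70 °C, and the model makes it easy to see why a degraded TIM contact (a larger "tim" entry) translates directly into a hotter die at constant power.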
The extreme density of these silicon packages relies heavily on advanced packaging techniques. For a deeper dive into how these chips are constructed before they even touch the PCB, read CoWoS Packaging Explained: Why H100 and B200 GPUs Depend on Advanced 2.5D Packaging.
Integrating a cold plate onto a PCB is not as simple as swapping out a fan heatsink. The presence of a cold plate cooling system introduces strict electrical and layout constraints for PCB designers.
Cold plates require robust mechanical mounting to ensure uniform pressure across the silicon die. Insufficient pressure leads to poor TIM contact, causing thermal throttling, while excessive pressure can crack the delicate silicon die. The PCB must feature precise, non-plated through-holes (NPTH) for the mounting screws. Surrounding these holes are strict keepout zones where no components, and often no traces on the outer layers, can be placed. Furthermore, the routing of the inlet and outlet tubes requires spatial clearance, restricting the placement of tall components like capacitors and VRMs (Voltage Regulator Modules) near the GPU.
Because the cold plate physically blocks large areas of the top layer, placing VRMs close to the GPU becomes challenging. In high-power AI accelerators, the current can exceed 1000 Amperes. According to Joule's Law (P = I²R), even a small trace resistance results in significant power loss and heat generation at these currents. Designers often have to use extremely thick copper planes and bury the power delivery network (PDN) routing in the inner layers of the PCB. The mounting holes for the cold plate also perforate the internal power and ground planes like "Swiss cheese", increasing loop inductance and degrading PDN performance.
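To illustrate why plane resistance matters at these currents, the sketch below applies P = I²R to a rectangular copper plane. The plane dimensions (30 mm long, 60 mm wide), the 70 µm thickness (roughly 2 oz copper), and the choice of four parallel planes are assumptions for illustration only. The model also ignores the mounting-hole perforations and temperature dependence of resistivity.

```python
COPPER_RESISTIVITY_OHM_M = 1.68e-8  # bulk copper at about 20 °C


def plane_resistance_ohm(length_m, width_m, thickness_m, parallel_planes=1):
    """DC resistance of a solid rectangular plane, R = rho * L / (W * t)."""
    single = COPPER_RESISTIVITY_OHM_M * length_m / (width_m * thickness_m)
    return single / parallel_planes


def conduction_loss_w(current_a, resistance_ohm):
    """Joule heating, P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


CURRENT_A = 1000.0                 # current level quoted in the text
LENGTH_M, WIDTH_M = 0.030, 0.060   # assumed VRM-to-GPU plane geometry
THICKNESS_M = 70e-6                # roughly 2 oz copper

r_one = plane_resistance_ohm(LENGTH_M, WIDTH_M, THICKNESS_M)
r_four = plane_resistance_ohm(LENGTH_M, WIDTH_M, THICKNESS_M, parallel_planes=4)

print(f"1 plane:  {r_one * 1e6:.0f} uOhm -> {conduction_loss_w(CURRENT_A, r_one):.0f} W lost")
print(f"4 planes: {r_four * 1e6:.0f} uOhm -> {conduction_loss_w(CURRENT_A, r_four):.0f} W lost")
```

Because the loss scales with the square of the current, spreading the same current over more copper is one of the few levers available once the VRMs have been pushed away from the GPU.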
AI accelerators communicate via ultra-high-speed interconnects. For instance, you can learn how these signals are routed in our guides on What Is NVLink? How NVIDIA's High-Speed GPU Interconnect Shapes PCB Routing and 112G PAM4 PCB Design for AI Servers: Material Selection, Trace Routing and SI Rules. The large keepout areas required by cold plate tubing force high-speed signals to take longer detours. Designers must meticulously manage trace lengths, impedance (usually targeted at 85Ω or 100Ω differential), and skew matching while navigating around the mechanical obstacles of the liquid cooling hardware.
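As a rough way to see how detour length turns into timing skew, the sketch below converts an intra-pair length mismatch into picoseconds and compares it with one unit interval. It assumes a stripline whose propagation delay is sqrt(Dk)/c, a placeholder dielectric constant of 3.4, and a nominal 56 GBaud symbol rate (112 Gb/s at 2 bits per PAM4 symbol). None of these numbers come from a specific laminate datasheet or interface specification.

```python
import math

SPEED_OF_LIGHT_MM_PER_PS = 0.299792458


def delay_ps_per_mm(dk):
    """Stripline propagation delay for an assumed effective dielectric constant."""
    return math.sqrt(dk) / SPEED_OF_LIGHT_MM_PER_PS


def skew_ps(length_mismatch_mm, dk):
    """Timing skew produced by a length mismatch between the two legs of a pair."""
    return length_mismatch_mm * delay_ps_per_mm(dk)


def unit_interval_ps(symbol_rate_gbaud):
    """Duration of one symbol in picoseconds."""
    return 1000.0 / symbol_rate_gbaud


DK = 3.4                 # placeholder value for a low-loss laminate
MISMATCH_MM = 0.5        # hypothetical mismatch left after routing around a keepout
SYMBOL_RATE_GBAUD = 56.0 # nominal rate for 112 Gb/s PAM4

skew = skew_ps(MISMATCH_MM, DK)
ui = unit_interval_ps(SYMBOL_RATE_GBAUD)
print(f"Delay: {delay_ps_per_mm(DK):.2f} ps/mm")
print(f"Skew:  {skew:.2f} ps ({100.0 * skew / ui:.0f}% of a {ui:.1f} ps unit interval)")
```

Even half a millimetre of uncompensated mismatch consumes a noticeable share of the unit interval under these assumptions, which is why detours around cooling hardware have to be length-matched rather than simply rerouted.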
While rare, coolant leaks are catastrophic in a data center. Advanced AI server PCBs now incorporate built-in leak detection. This is often implemented as a pair of parallel, exposed, gold-plated copper traces (using ENIG surface finish) routed in a serpentine pattern around the perimeter of the cold plate and manifold fittings. If conductive liquid bridges the gap between the two traces, the resistance between them drops, triggering a hardware interrupt that immediately cuts power to the server to prevent a short circuit.
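The detection logic can be sketched in a few lines. The example below assumes a simple sensing circuit that the text does not specify: the sense traces form the lower leg of a voltage divider with a pull-up resistor, and an ADC reads the voltage across the gap. The reference voltage, pull-up value, threshold, and debounce count are all hypothetical and would need tuning against the actual coolant conductivity and trace geometry.

```python
import math

V_REF = 3.3                      # assumed ADC reference / pull-up rail, volts
R_PULLUP_OHM = 1_000_000.0       # assumed pull-up resistor
LEAK_THRESHOLD_OHM = 500_000.0   # hypothetical trip point
REQUIRED_CONSECUTIVE = 3         # hypothetical debounce depth


def sense_resistance_ohm(v_adc):
    """Resistance across the trace gap, derived from the divider voltage."""
    if v_adc >= V_REF:
        return math.inf          # dry board: open circuit between the traces
    if v_adc <= 0.0:
        return 0.0               # hard short across the gap
    return R_PULLUP_OHM * v_adc / (V_REF - v_adc)


def leak_detected(v_adc):
    """True when the gap resistance has dropped below the trip point."""
    return sense_resistance_ohm(v_adc) < LEAK_THRESHOLD_OHM


def should_cut_power(samples):
    """Trip only after several consecutive leak readings, to reject noise."""
    streak = 0
    for v in samples:
        streak = streak + 1 if leak_detected(v) else 0
        if streak >= REQUIRED_CONSECUTIVE:
            return True
    return False


print(should_cut_power([3.3, 3.3, 3.2, 3.3]))   # dry readings
print(should_cut_power([3.3, 0.6, 0.5, 0.4]))   # sustained drop in resistance
```

The debounce step is a design choice rather than something the text prescribes: a real implementation has to balance reaction time against false trips from condensation or electrical noise.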
The mechanical stress on a liquid-cooled PCB is significantly different from that on an air-cooled board. While massive air heatsinks place static vertical weight on the board, liquid cooling systems introduce dynamic forces.
When a heavy cold plate is bolted down, and stiff coolant tubes apply lateral tension, the PCB is highly susceptible to bowing and twisting. If the PCB warps, the solder joints beneath the massive BGA (Ball Grid Array) packages can fracture. To mitigate this, PCBs for AI servers are exceptionally thick. It is common to see board thicknesses of 3.2mm or even 4.0mm, consisting of 24 to 30+ layers. For more on this, refer to Why AI GPUs Require 30+ Layer HDI PCBs.
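One reason thickness helps so much is that the bending stiffness of a plate grows with the cube of its thickness. The sketch below uses the classical thin-plate flexural rigidity D = E·t³ / (12·(1 − ν²)) to compare the thicknesses mentioned above against a common 1.6 mm board. The modulus and Poisson ratio are placeholder values for a glass-reinforced laminate, and a real multilayer stackup with copper planes will differ; only the ratios are meant to be informative.

```python
def flexural_rigidity_n_m(youngs_modulus_pa, thickness_m, poisson_ratio):
    """Thin-plate flexural rigidity, D = E * t^3 / (12 * (1 - nu^2))."""
    return youngs_modulus_pa * thickness_m ** 3 / (12.0 * (1.0 - poisson_ratio ** 2))


E_PA = 24e9   # placeholder in-plane modulus for a glass-reinforced laminate
NU = 0.15     # placeholder Poisson ratio

baseline = flexural_rigidity_n_m(E_PA, 1.6e-3, NU)
for thickness_mm in (1.6, 3.2, 4.0):
    d = flexural_rigidity_n_m(E_PA, thickness_mm * 1e-3, NU)
    print(f"{thickness_mm:.1f} mm board: {d / baseline:.1f}x the stiffness of 1.6 mm")
```

Doubling the thickness from 1.6 mm to 3.2 mm gives roughly eight times the bending stiffness for the same material, which is why thick boards resist the bowing described above far better than the layer count alone would suggest.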
Additionally, rigid metal stiffeners are often bolted to the back of the PCB to maintain absolute flatness, ensuring co-planarity across the BGA footprint. To understand the intricacies of soldering these massive packages, review BGA Assembly for AI Accelerator Cards: Challenges, Inspection and Quality Control.
Even though liquid cooling keeps the absolute temperature of the GPU lower, steep thermal gradients still exist between the cold plate, the silicon die, the organic substrate, and the FR4/Megtron PCB. Each of these materials has a different coefficient of thermal expansion (CTE). Metals like copper have a CTE of around 16.5 ppm/°C, while the PCB substrate in the X-Y plane might be 12-14 ppm/°C. As the server powers up and down, these materials expand and contract at different rates. Over time, this CTE mismatch induces shear stress on the solder balls, leading to fatigue and eventual failure. Selecting PCB base materials with a tightly controlled CTE is a critical defense mechanism.
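The size of the mismatch can be estimated with the linear expansion relation ΔL = (α₁ − α₂) · L · ΔT. The sketch below plugs in the two CTE figures quoted above (taking 13 ppm/°C as the midpoint of the PCB range). The 35 mm distance from the package center and the 50 °C temperature swing are assumptions for illustration, and the calculation is a first-order estimate that ignores warpage and the compliance of the solder itself.

```python
def differential_expansion_um(cte_a_ppm, cte_b_ppm, length_mm, delta_t_c):
    """Relative in-plane displacement between two materials, in micrometres."""
    mismatch_per_c = (cte_a_ppm - cte_b_ppm) * 1e-6
    return mismatch_per_c * length_mm * delta_t_c * 1000.0  # mm -> um


CTE_COPPER_PPM = 16.5   # from the text
CTE_PCB_XY_PPM = 13.0   # midpoint of the 12-14 ppm/°C range in the text
DISTANCE_MM = 35.0      # assumed distance from package center to a corner ball
DELTA_T_C = 50.0        # assumed temperature swing per power cycle

shift = differential_expansion_um(CTE_COPPER_PPM, CTE_PCB_XY_PPM, DISTANCE_MM, DELTA_T_C)
print(f"Relative displacement at the corner: {shift:.1f} um per cycle")
```

A displacement of a few micrometres sounds small, but it is imposed on the outermost joints at every power cycle, which is why corner balls on large BGA packages tend to be the first to show fatigue.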
At the system level, an AI server is not just a single PCB; it is an ecosystem of interconnected boards. To see how these modules are standardized, read What Is an OAM Module? Open Accelerator Module Standard for AI Hardware.
When outfitting an OAM baseboard or a custom GPU carrier board with liquid cooling, individual cold plates must be connected to a central distribution point. This is achieved using a manifold.
The manifold is a rigid or semi-rigid pipe assembly that runs alongside the PCB. It acts as the main artery, splitting the incoming cold fluid into multiple parallel streams, sending one stream to each GPU cold plate, and then collecting the heated fluid into a return pipe. The manifold must be carefully engineered to ensure equal flow rates and pressure drops across all GPUs, ensuring that GPU #8 receives the same cooling capacity as GPU #1.
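The balancing problem can be illustrated with a deliberately simplified model. Parallel branches of a manifold see the same pressure drop, so if each branch is treated as a linear hydraulic resistance (ΔP = R · Q), the flow divides in proportion to 1/R. Real coolant loops are often turbulent, where pressure drop scales closer to Q², so this is a sketch of the idea rather than a sizing method. The resistance values and total flow are hypothetical.

```python
def branch_flows_lpm(total_flow_lpm, branch_resistances):
    """Split total flow across parallel branches, assuming delta_p = R * Q per branch."""
    conductances = [1.0 / r for r in branch_resistances]
    total_conductance = sum(conductances)
    return [total_flow_lpm * g / total_conductance for g in conductances]


TOTAL_FLOW_LPM = 12.0  # hypothetical flow delivered to an 8-GPU tray

# Balanced manifold: every cold plate branch has the same resistance.
balanced = branch_flows_lpm(TOTAL_FLOW_LPM, [1.0] * 8)

# Unbalanced manifold: the last branch has 50% more resistance (e.g. longer tubing).
unbalanced = branch_flows_lpm(TOTAL_FLOW_LPM, [1.0] * 7 + [1.5])

for name, flows in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(f"{name}: GPU #1 {flows[0]:.2f} L/min, GPU #8 {flows[-1]:.2f} L/min")
```

In the unbalanced case the last GPU receives noticeably less coolant than the first even though total flow is unchanged, which is the situation manifold design is meant to prevent.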
In modern AI server racks, maintenance must be fast and foolproof. When a technician slides a GPU tray into a server chassis, the fluid connections must mate automatically without leaking a single drop. This is achieved using Universal Quick Disconnects (UQDs). The PCB tray is designed with strict mechanical tolerances so that as the electrical connectors (like high-density PCIe or custom power headers) mate with the midplane, the UQDs simultaneously snap into the rack manifold. This level of system integration requires close collaboration between PCB layout engineers and mechanical CAD engineers. For a macro perspective on how racks are constructed, see GPU Rack Architecture: How AI Clusters Are Built from PCB to Rack Level.
The choice of substrate material is paramount when designing an AI server liquid cooling system. While the liquid cooling keeps the silicon cool, the PCB itself must still endure the rigorous reflow assembly process, the weight of the cooling apparatus, and the high-speed signal demands of modern AI.
Standard FR4 is inadequate for these applications. High-speed, low-loss laminates are mandatory, and materials such as Panasonic Megtron 6, Megtron 7, or Rogers laminates are frequently used for this reason.
For an exhaustive breakdown of substrate choices, refer to High-Speed PCB Materials for AI Servers: Rogers, Megtron, Panasonic and More.
Additionally, the surface finish of the PCB matters. For the immense BGA packages used in AI hardware, Electroless Nickel Immersion Gold (ENIG) or Electroless Nickel Electroless Palladium Immersion Gold (ENEPIG) provide a very flat surface for optimal solder joint reliability.
Liquid cooling does not entirely eliminate the need for airflow. While direct-to-chip cold plates handle the bulk of the heat generated by GPUs and CPUs (which can account for 80% of the total system heat), other components like RAM, storage drives, and minor voltage regulators still rely on air convection. Therefore, most liquid-cooled servers are "hybrid" cooled, using slow-spinning fans to cool the remaining components. For insights into memory layouts, see HBM vs GDDR7: Memory Architecture Choices and Their PCB Layout Implications.
Immersion cooling (where the entire PCB is submerged in fluid) offers excellent uniform cooling and eliminates fans entirely. However, it requires highly specialized tanks, makes maintenance messy and difficult, and requires PCBs to be manufactured without materials that could degrade in the dielectric fluid (such as certain plastics or adhesives). Cold plate cooling is currently the most widely adopted standard because it fits within traditional rack infrastructures.
Liquid cooling requires PCBs to be thicker and structurally stronger to handle mechanical stress. It also demands exceptionally precise drilling and routing to accommodate mounting holes and keepout zones. NextPCB utilizes state-of-the-art lamination and CNC profiling equipment to achieve the tight tolerances required. Read more about our process in How GPU PCBs Are Manufactured: From Bare Board to Final PCBA.
Due to the extremely high density of components pushed together by liquid cooling keepout zones, standard through-hole vias are rarely sufficient. Designers must rely heavily on Any-Layer HDI (High Density Interconnect), blind vias, and buried vias. This allows signals to be routed underneath dense BGA packages without consuming real estate on all layers.
The transition to liquid cooling in AI infrastructure is not merely a mechanical upgrade; it is a fundamental shift that dictates every aspect of board-level design. From routing high-speed traces around manifold fittings to ensuring structural integrity under the weight of cold plates, PCB engineers face unprecedented challenges.
To succeed, hardware developers need a manufacturing partner capable of executing extreme layer counts, ultra-tight tolerances, and advanced HDI structures flawlessly.
Need to manufacture liquid-cooled AI server PCBs? Get a quote from NextPCB →
Still need help? Contact Us: support@nextpcb.com