Stacy Lu
Support Team
Feedback: support@nextpcb.com
The generative AI revolution, driven by Large Language Models (LLMs) and advanced neural networks, has fundamentally changed hardware engineering. At the heart of this computing leap are AI chips such as the NVIDIA H100, H200, and Blackwell-based B200 accelerators. However, these silicon marvels cannot function in isolation. They require a highly sophisticated, ultra-reliable foundation to operate: the AI accelerator PCB.
Designing an AI server PCB is drastically different from standard consumer electronics or traditional server motherboards. When dealing with thermal design power (TDP) exceeding 700W per module, data transfer rates hitting 112 Gbps PAM4, and dense interconnections compliant with PCI-SIG PCIe Gen 5/6 and NVIDIA NVLink specifications, the printed circuit board becomes the ultimate bottleneck. Every trace, every via, and every dielectric layer must be meticulously engineered to maintain signal integrity and manage extreme heat.
In this comprehensive guide, we will explore the critical aspects of GPU PCB design, diving deep into layer stackups, advanced substrate materials, power delivery, and the impact of CoWoS packaging. Whether you are building an OAM-compliant baseboard or a custom AI hardware design, understanding these printed circuit board constraints is vital.
To understand the current state of GPU board design, it is essential to look at how we transitioned from standard PCIe expansion cards to custom module form factors. If you want a broader overview of the system architecture, you can read our guide on What Is an AI Server? Architecture, Components & PCB Requirements.
Historically, GPU boards utilized 12 to 16 layers of standard FR-4 or mid-loss materials. However, as the demand for parallel processing grew, the bandwidth limitations of the PCIe bus drove the industry toward proprietary interconnects like NVIDIA's NVLink and open industry standards defined by the Open Compute Project (OCP), such as the Open Accelerator Module (OAM).
The leap from architectures like Ampere to Hopper completely redefined the PCB. As detailed in our comparison of A100 vs H100: GPU Generational Leap & PCB Stack Differences Explained, today's AI accelerator baseboards (often housing 4 to 8 GPUs) act as massive networking switches as much as compute platforms, demanding unprecedented High-Density Interconnect (HDI) PCB technology.
One of the most defining characteristics of an AI PCB design is its layer count. While a high-end gaming GPU might use a 14-layer board, an H100 SXM board and universal baseboards (UBB) frequently push beyond 24 layers, often reaching 30 to 40 layers.
| Layer | Function |
|---|---|
| 1 | Top Signal (Connectors, Microvias) |
| 2 | GND Reference |
| 3 | High-Speed Signal (Stripline - NVLink) |
| 4 | GND Reference |
| 5 | Power Plane (VDD_CORE) |
| 6 | GND Reference |
| 7 to 24 | Mixed Signal, Power, GND |
| 25 | GND Reference |
| 26 | Power Plane (VDD_HBM) |
| 27 | GND Reference |
| 28 | High-Speed Signal (Stripline - PCIe) |
| 29 | GND Reference |
| 30 | Bottom Signal (Bypass Caps, Routing) |
This stackup demands exceptional lamination precision. The aspect ratio (the ratio of board thickness to drill hole diameter) can exceed 15:1. High-reliability fabrication practices must strictly follow standards such as IPC-6012 and advanced controlled depth drilling procedures.
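The aspect ratio defined above is simple arithmetic, and a quick check can flag a drill size that pushes a thick board past the 15:1 mark. The following is a minimal sketch; the 4.0 mm board thickness and 0.25 mm drill diameter are hypothetical values chosen for illustration, not figures from any specific H100 or UBB design.

```python
def aspect_ratio(board_thickness_mm: float, drill_diameter_mm: float) -> float:
    """Return board thickness divided by drill hole diameter."""
    if drill_diameter_mm <= 0:
        raise ValueError("drill diameter must be positive")
    return board_thickness_mm / drill_diameter_mm


# Hypothetical example: a 4.0 mm thick board with 0.25 mm through-holes.
ratio = aspect_ratio(4.0, 0.25)
print(f"aspect ratio = {ratio:.0f}:1")
print("exceeds 15:1" if ratio > 15 else "within 15:1")
```

The higher this ratio, the harder it becomes to plate the hole wall uniformly, which is one reason thick AI boards lean on controlled depth drilling and microvias.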
At signal speeds of 112 Gbps (using PAM4 modulation), the dielectric constant (Dk) and dissipation factor (Df) of the PCB substrate dictate how much the signal degrades as it travels through the copper traces.
While conventional FR-4 alone is generally insufficient for long-reach 112G PAM4 channels, it is often combined with Ultra-Low Loss (ULL) materials in hybrid stackups. For example, a 30-layer board might use high-tier materials (like Megtron 6) for the critical high-speed inner layers, while utilizing standard FR-4 for outer layers handling power delivery and low-speed control signals, thereby balancing immense performance requirements with manufacturing costs.
| Material Grade | Example Brands/Series | Dk (@ 10GHz) | Df (@ 10GHz) | Application in AI Hardware |
|---|---|---|---|---|
| Mid-Loss / FR-4 | Isola 370HR, Panasonic Megtron 4 | 3.8 - 4.5 | 0.008 - 0.020 | Power distribution networks, management networks, hybrid stackup outer layers. |
| Ultra-Low Loss | Panasonic Megtron 6, Isola Tachyon | 3.4 - 3.6 | 0.002 - 0.004 | PCIe Gen 5, H100 baseboards, 56G/112G PAM4 routing. |
| Extreme-Low Loss | Panasonic Megtron 8, Rogers 3000 series | 3.0 - 3.2 | < 0.0015 | Emerging 224G PAM4 systems, next-gen PCIe Gen 6, high-frequency RF. |
Selecting the right high-speed PCB materials ensures that intricate networks formed by interconnect technologies can operate without unacceptable bit error rates (BER). Read more in our deep dive: What Is NVLink? How NVIDIA's High-Speed GPU Interconnect Shapes PCB Routing.
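To get a feel for how Dk and Df translate into signal loss, the dielectric contribution can be estimated with the standard transmission-line approximation α_d ≈ 8.686 · π · f · √Dk · Df / c (in dB per metre). The sketch below rests on several assumptions: it uses mid-range Dk/Df values picked from the table above, reuses those 10 GHz figures at a nominal 28 GHz Nyquist frequency for 112G PAM4, and ignores conductor loss and copper roughness entirely. It is a rough comparison, not a channel simulation.

```python
import math

C_M_PER_S = 299_792_458.0   # speed of light in vacuum
INCH_M = 0.0254             # metres per inch


def dielectric_loss_db_per_inch(freq_hz: float, dk: float, df: float) -> float:
    """Approximate dielectric attenuation in dB/inch (conductor loss ignored)."""
    nepers_per_m = math.pi * freq_hz * math.sqrt(dk) * df / C_M_PER_S
    return nepers_per_m * 8.686 * INCH_M


# Illustrative mid-range values taken from the material table (assumptions).
materials = {
    "Mid-Loss / FR-4": (4.2, 0.015),
    "Ultra-Low Loss": (3.5, 0.003),
    "Extreme-Low Loss": (3.1, 0.0015),
}

nyquist_hz = 28e9  # nominal Nyquist frequency assumed for 112G PAM4
for name, (dk, df) in materials.items():
    loss = dielectric_loss_db_per_inch(nyquist_hz, dk, df)
    print(f"{name:18s} ~{loss:.2f} dB/inch dielectric loss")
```

Because the loss scales linearly with Df, moving from a mid-loss laminate to an ultra-low-loss one cuts the dielectric term by roughly the ratio of their dissipation factors, which is why the high-speed layers get the premium material in a hybrid stackup.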
One cannot discuss modern H100 PCB or GPU PCB design without understanding what happens above the printed circuit board. The primary driver of PCB complexity in AI accelerators is the transition to 2.5D and 3D advanced packaging.
In architectures like Hopper and Blackwell, the GPU die does not sit directly on the PCB substrate. Instead, it relies on technologies like TSMC's CoWoS (Chip-on-Wafer-on-Substrate). The primary compute die and multiple stacks of High Bandwidth Memory (HBM3e or HBM4) are mounted on a silicon interposer. This allows massive data bandwidth (terabytes per second) between the GPU and memory without routing those signals through the main PCB.
However, this creates a formidable challenge for PCB layout engineers: Package Escape Routing. Because the interposer consolidates all external I/O, PCIe, NVLink, and power connections into an extremely dense array of bumps at the bottom of the IC substrate, the mating AI accelerator PCB requires incredibly tight via-in-pad rules, microscopic trace widths/spacing, and advanced Any-Layer HDI to successfully fan out these signals to the rest of the board.
At the multi-gigahertz frequencies used by IEEE Ethernet and PCI-SIG PCIe links, PCB traces act as transmission lines. Controlled differential impedance must be strictly maintained (e.g., 85 Ω for PCIe Gen 5 and 100 Ω for Ethernet).
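Why impedance control matters can be illustrated with the reflection coefficient at a discontinuity, Γ = (Z₂ − Z₁) / (Z₂ + Z₁), and the corresponding return loss, −20·log₁₀|Γ|. The sketch below assumes a hypothetical 85 Ω line that runs into a section fabricated 10% high; the tolerance figure is illustrative, not a specification limit.

```python
import math


def reflection_coefficient(z_next_ohm: float, z_line_ohm: float) -> float:
    """Voltage reflection coefficient where a line of z_line meets z_next."""
    return (z_next_ohm - z_line_ohm) / (z_next_ohm + z_line_ohm)


def return_loss_db(gamma: float) -> float:
    """Return loss in dB; infinite when there is no reflection."""
    if gamma == 0:
        return math.inf
    return -20.0 * math.log10(abs(gamma))


# Hypothetical case: 85 ohm target, one segment lands 10% high at 93.5 ohm.
gamma = reflection_coefficient(93.5, 85.0)
print(f"reflection coefficient = {gamma:.4f}")
print(f"return loss            = {return_loss_db(gamma):.1f} dB")
```

Each discontinuity along a channel (via, connector, neck-down) adds a reflection of this kind, so the tighter the impedance tolerance, the less energy is bounced back toward the transmitter.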
The Power Delivery Network (PDN) on an AI module is arguably its most critical system. A single OAM or SXM module can draw 700 watts or more. Because core voltages are typically below 1.0V, the aggregate current supplied to the GPU package can reach several hundred amperes.
However, this current is not delivered over a single rail. AI servers utilize a distributed PDN architecture featuring independent power rails for the Core (VDD_CORE), Memory (VDD_HBM), and Auxiliary I/O. According to Ohm's Law (V = I × R), even 0.5 milliohms of resistance in the PCB power path matters at these current levels: at 500 A, for example, it drops 0.25 V and dissipates 125 W as heat (P = I² × R), which is unacceptable on a rail that sits below 1.0V.
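The arithmetic behind these PDN numbers is easy to script. The sketch below is a rough budget check under stated assumptions: the 0.8 V core voltage and the 500 A rail current are hypothetical values chosen for illustration, and treating the full 700 W as a single rail gives only an upper bound on current, since the real load is split across several rails.

```python
def rail_current_a(power_w: float, voltage_v: float) -> float:
    """Current drawn by a load of power_w at voltage_v (I = P / V)."""
    return power_w / voltage_v


def ir_drop_v(current_a: float, resistance_ohm: float) -> float:
    """Voltage lost across the power path (V = I * R)."""
    return current_a * resistance_ohm


def resistive_loss_w(current_a: float, resistance_ohm: float) -> float:
    """Heat generated in the power path (P = I^2 * R)."""
    return current_a ** 2 * resistance_ohm


# Upper bound: the whole 700 W module at an assumed 0.8 V core voltage.
print(f"aggregate current ~{rail_current_a(700.0, 0.8):.0f} A")

# Hypothetical single rail carrying 500 A through 0.5 milliohm of copper.
current, resistance = 500.0, 0.5e-3
print(f"IR drop   = {ir_drop_v(current, resistance) * 1000:.0f} mV")
print(f"heat loss = {resistive_loss_w(current, resistance):.0f} W")
```

Because the loss grows with the square of the current, halving the path resistance (thicker copper, more planes, shorter paths to the package) pays off far more at these currents than it would on a conventional board.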
As components draw immense power, the PCB itself acts as a massive heat spreader. High temperatures can cause the PCB substrate to expand in the Z-axis, stressing via barrels and potentially causing micro-cracks. It also increases the material's Df, worsening signal loss.
Engineers must integrate robust thermal solutions directly into the bare board, including dense arrays of thermal vias beneath power stages, embedded copper coins for localized extreme heat, and specifying high-Tg materials (> 200°C) capable of surviving continuous data center thermal stress.
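As a rough illustration of why thermal vias are used in dense arrays, the conduction resistance of a single plated barrel can be estimated as R = L / (k · A), where A is the annular copper cross-section. Every dimension below is a hypothetical assumption (3.0 mm board, 0.3 mm drill, 25 µm plating), as is the copper conductivity of 385 W/(m·K). The sketch ignores via fill, heat spreading in the planes, and interface resistances.

```python
import math


def via_thermal_resistance(length_m: float,
                           drill_diameter_m: float,
                           plating_thickness_m: float,
                           k_copper: float = 385.0) -> float:
    """Conduction resistance (K/W) of one plated via barrel, R = L / (k * A)."""
    inner_diameter_m = drill_diameter_m - 2.0 * plating_thickness_m
    if inner_diameter_m < 0:
        raise ValueError("plating is thicker than the drilled hole allows")
    area_m2 = math.pi / 4.0 * (drill_diameter_m ** 2 - inner_diameter_m ** 2)
    return length_m / (k_copper * area_m2)


def array_thermal_resistance(single_via_k_per_w: float, count: int) -> float:
    """Identical vias conduct in parallel, so resistance divides by count."""
    return single_via_k_per_w / count


# Hypothetical geometry: 3.0 mm board, 0.3 mm drill, 25 um copper plating.
single = via_thermal_resistance(3.0e-3, 0.3e-3, 25e-6)
print(f"one via : {single:.0f} K/W")
print(f"100 vias: {array_thermal_resistance(single, 100):.2f} K/W")
```

A single barrel is a poor heat path on its own; only a large parallel array, or a solid feature such as an embedded copper coin, brings the resistance down to a useful level.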
Designing a 30-layer, Any-Layer HDI NVLink PCB is useless if it cannot be reliably manufactured. Registration—the alignment of layers during lamination—is critical. If a thick board shifts by even a few mils, microvias will miss their target pads. Manufacturers must employ precise laser direct imaging (LDI), sequential lamination, and automated optical inspection (AOI) to ensure adherence to IPC Class 2, IPC Class 3, or other high-reliability manufacturing requirements where applicable.
As we look toward 2026 and 2027, the roadmap for AI hardware design is aggressively pushing the physical limits of copper and conventional substrates.
| Generation | Interconnect / Standard | Status / Timeline |
|---|---|---|
| PCIe Gen 5 | 32 GT/s (NRZ) | Mainstream Deployment |
| 112G PAM4 | Current AI Backplanes (NVLink/Ethernet) | High-Volume Production |
| PCIe Gen 6 | 64 GT/s (PAM4) | Early Adoption / Scaling |
| 224G PAM4 | Next-Generation AI Clusters | Emerging / Prototyping Phase |
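The rates in this table map to very different analog bandwidths depending on the modulation. NRZ carries one bit per symbol and PAM4 carries two, so the symbol rate is the bit rate divided by bits per symbol, and the Nyquist frequency is half the symbol rate. The sketch below uses nominal round-number rates; actual line rates include encoding overhead and differ slightly.

```python
BITS_PER_SYMBOL = {"NRZ": 1, "PAM4": 2}


def symbol_rate_gbaud(bit_rate_gbps: float, modulation: str) -> float:
    """Symbol rate in GBd for a nominal raw bit rate and modulation."""
    return bit_rate_gbps / BITS_PER_SYMBOL[modulation]


def nyquist_ghz(bit_rate_gbps: float, modulation: str) -> float:
    """Nyquist frequency in GHz: half the symbol rate."""
    return symbol_rate_gbaud(bit_rate_gbps, modulation) / 2.0


# Nominal per-lane rates from the roadmap table.
links = [
    ("PCIe Gen 5", 32.0, "NRZ"),
    ("112G PAM4", 112.0, "PAM4"),
    ("PCIe Gen 6", 64.0, "PAM4"),
    ("224G PAM4", 224.0, "PAM4"),
]

for name, rate, modulation in links:
    print(f"{name:11s} {symbol_rate_gbaud(rate, modulation):6.1f} GBd, "
          f"Nyquist {nyquist_ghz(rate, modulation):5.1f} GHz")
```

This shows why PCIe Gen 6 can double the data rate of Gen 5 at the same Nyquist frequency, at the cost of the tighter noise margins of PAM4, while 224G PAM4 doubles the frequency the laminate has to carry.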
To overcome the insertion loss limits of copper at 224G and beyond, the industry is moving toward Co-Packaged Optics (CPO) and Silicon Photonics. By moving the optical transceivers directly onto the package substrate alongside the GPU or switch ASIC, CPO drastically reduces the electrical trace length on the PCB, shifting the design burden from complex high-speed copper routing to managing thermal dissipation and optical fiber integration.
Standard GPUs typically use 10-14 layers with mid-loss materials. AI accelerator modules (like OAM or SXM) require 24-40 layers, utilize ultra-low loss materials in hybrid stackups, and feature extensive Any-Layer HDI to support complex package escape routing for CoWoS-based chips.
At high speeds (PCIe Gen 5, 112G PAM4), the unused portion of a plated through-hole via (the stub) behaves as a resonant transmission-line stub that reflects signal energy back into the channel. Back-drilling removes these stubs, preventing destructive interference and maintaining the signal integrity that PCI-SIG channel requirements demand.
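The stub effect can be approximated with the quarter-wave resonance formula f ≈ c / (4 · L · √Dk), where L is the stub length. The sketch below assumes a hypothetical 2.0 mm stub, a 0.25 mm residual stub after back-drilling, and a Dk of 3.5. It ignores pad and barrel capacitance, which in practice pull the resonance lower.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def stub_resonance_ghz(stub_length_m: float, dk: float) -> float:
    """First quarter-wave resonance of a via stub, in GHz (idealized)."""
    return C_M_PER_S / (4.0 * stub_length_m * math.sqrt(dk)) / 1e9


# Hypothetical stub lengths before and after back-drilling, with Dk = 3.5.
before = stub_resonance_ghz(2.0e-3, 3.5)
after = stub_resonance_ghz(0.25e-3, 3.5)
print(f"2.00 mm stub: resonance near {before:.0f} GHz")
print(f"0.25 mm stub: resonance near {after:.0f} GHz")
```

Under these assumptions the un-drilled stub resonates close to the Nyquist band of a 112G PAM4 link, while the shortened stub pushes the notch far above it.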
CoWoS moves the high-density HBM routing onto a silicon interposer, reducing memory routing on the main PCB. However, it consolidates all I/O and power into an ultra-dense BGA/land grid footprint, requiring advanced HDI and via-in-pad technologies for successful package escape routing.
Not for critical high-speed channels.
Most modern AI server PCBs use hybrid stackups combining FR-4 with ultra-low-loss materials such as Megtron 6 or Megtron 8 to meet PCIe Gen 5/6 and 112G PAM4 insertion loss requirements.
The transition to AI-centric data centers demands hardware fabrication that leaves zero room for error. Building an AI accelerator PCB requires a world-class manufacturing partner capable of meeting strict SI/PI requirements and advanced HDI tolerances.
NextPCB supports industry-leading engineering capabilities for AI hardware:
Don't let fabrication limitations bottleneck your next-generation compute clusters. Get an engineering review and quote from NextPCB, and let our engineering team help you bring your high-speed, high-density AI infrastructure to life.
Need help evaluating stackups, material selection, impedance budgets, or HDI manufacturability? Our engineering team can review your design files before production.