The phrase “AI server” has become ubiquitous in technology discussions, but its meaning is often reduced to “a server with GPUs.” From a hardware engineering perspective, that definition falls far short. An AI server is a purpose-built computing system optimized for the unique demands of machine learning workloads. Those demands differ fundamentally from those of traditional enterprise or cloud servers in compute density, memory bandwidth, interconnect speed, power consumption, and thermal output.
Understanding those differences is essential for anyone involved in designing, manufacturing, or procuring the printed circuit boards that make AI servers function. This article breaks down AI server architecture from the ground up, with a focus on the PCB-level implications of each major subsystem.
An AI server is a high-performance computing system designed specifically to accelerate artificial intelligence workloads—primarily training and inference of deep learning models. Unlike general-purpose servers optimized for transaction throughput or web serving, AI servers are architected around one priority: maximizing the throughput of tensor operations (matrix multiplications) at scale.
Modern AI servers are characterized by:
Leading AI server platforms in 2026 include NVIDIA DGX H100, DGX H200, DGX B200, and the GB200 NVL72 rack-scale system, as well as AMD-based platforms using MI300X accelerators.
| Dimension | General-Purpose Server | AI Server |
|---|---|---|
| Primary processor | CPU (multi-core) | GPU / AI Accelerator (thousands of cores) |
| Memory architecture | DDR5 DIMM, ~100–800 GB/s bandwidth | HBM3e on-package, up to 8 TB/s per GPU |
| Internal bandwidth | PCIe Gen4/5 x16 (64–128 GB/s bidirectional) | NVLink 4.0/5.0 (900–1,800 GB/s per GPU) |
| Power per node | 300–600 W | 3,000–10,000+ W |
| Cooling requirement | Air cooling (most configurations) | Air or direct liquid cooling (liquid cooling is the norm for B200) |
| PCB layer count | 8–12 layers (typical server board) | 16–32+ layers (GPU baseboard) |
| PCB material | Standard FR4 | Low-loss laminates (Megtron 6/7, Rogers, Tachyon) |
| Design complexity | Moderate | Extremely high (HDI, controlled impedance, CoWoS integration) |
The implications for PCB manufacturers are significant: AI server boards represent some of the most technically demanding work in the PCB industry, requiring capabilities that go well beyond standard server board production.
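To put the bandwidth row of the table in perspective, the sketch below computes ideal transfer times at the quoted peak rates. The 140 GB payload is a hypothetical figure chosen for illustration, and mapping the low and high end of each quoted range to the older and newer interface generation is an assumption. Real links deliver less than their peak rate, so treat the results as lower bounds.

```python
# Ideal transfer times at the peak bandwidths quoted in the table above.
# Assumptions (not from the article): the 140 GB payload, and the mapping of
# the low and high end of each quoted range to the older and newer generation.

PEAK_BANDWIDTH_GB_S = {
    "PCIe Gen4": 64.0,
    "PCIe Gen5": 128.0,
    "NVLink 4.0": 900.0,
    "NVLink 5.0": 1800.0,
}


def transfer_time_ms(payload_gb: float, bandwidth_gb_s: float) -> float:
    """Return the ideal time, in milliseconds, to move payload_gb gigabytes."""
    if bandwidth_gb_s <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_gb / bandwidth_gb_s * 1000.0


if __name__ == "__main__":
    payload_gb = 140.0  # hypothetical payload size
    for link, bandwidth in PEAK_BANDWIDTH_GB_S.items():
        print(f"{link:<11} {transfer_time_ms(payload_gb, bandwidth):8.1f} ms")
```

Even as a rough estimate, the gap of roughly an order of magnitude between PCIe and NVLink shows why GPU-to-GPU traffic gets its own dedicated routing on the baseboard.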
The accelerator cards are the heart of an AI server. In modern configurations, these are NVIDIA H100, H200, or B200 GPUs in SXM form factor, or AMD MI300X accelerators. Each card is itself a complex PCB assembly—a multi-layer board with a GPU die (or dies, in the case of B200’s dual-die CoWoS package), HBM memory stacks, power management ICs, and thermal interface materials.
SXM-form-factor GPUs do not function as standalone add-in cards. They mount to a baseboard (also called a GPU baseboard or switch board) that provides power, NVLink routing, and PCIe host connectivity. This baseboard is one of the most complex PCBs in the entire server.
The host CPU (typically AMD EPYC or Intel Xeon) handles system management, I/O orchestration, and workloads that are poorly suited to massively parallel execution. In AI servers, the CPU is deliberately de-emphasized: a DGX H100, for example, pairs eight H100 GPUs with just two Intel Xeon CPUs.
The motherboard connects the CPU to the GPU baseboards via PCIe, manages system memory (DDR5 DIMMs), and provides network interfaces. In high-end AI servers, the motherboard and GPU baseboard are sometimes integrated into a single large-format PCB for minimum latency.
GPU-to-GPU communication is handled by NVLink (NVIDIA) or Infinity Fabric (AMD). NVSwitch chips route NVLink traffic between all GPUs within a server node and, in rack-scale systems like the GB200 NVL72, between nodes.
NVSwitch chips are mounted on the baseboard PCB and require their own high-density routing between dozens of NVLink differential pairs. In a DGX H100, the baseboard contains four NVSwitch chips, each connected to all eight GPUs via NVLink 4.0 traces.
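A back-of-envelope estimate gives a sense of the routing density involved. The sketch below uses figures quoted elsewhere in this article (900 GB/s of NVLink 4.0 bandwidth per GPU and 100 Gb/s per lane; 1,800 GB/s and 200 Gb/s for NVLink 5.0). It assumes that each lane is one differential pair carrying traffic in one direction, that the per-GPU bandwidth counts both directions, and that encoding overhead can be ignored. The function names are illustrative only.

```python
# Rough count of NVLink differential pairs on an 8-GPU baseboard.
# Assumptions: one lane = one differential pair in one direction, the per-GPU
# bandwidth figure counts both directions, and encoding overhead is ignored.


def nvlink_pairs_per_gpu(bandwidth_gbyte_s: float, lane_rate_gbit_s: float) -> int:
    """Estimate differential pairs per GPU from aggregate bandwidth."""
    total_gbit_s = bandwidth_gbyte_s * 8  # convert bytes/s to bits/s
    return round(total_gbit_s / lane_rate_gbit_s)


def baseboard_pairs(gpu_count: int, pairs_per_gpu: int) -> int:
    """Total GPU-side NVLink pairs that must escape onto the baseboard."""
    return gpu_count * pairs_per_gpu


if __name__ == "__main__":
    nvlink4 = nvlink_pairs_per_gpu(900, 100)
    nvlink5 = nvlink_pairs_per_gpu(1800, 200)
    print("NVLink 4.0 pairs per GPU:", nvlink4)
    print("NVLink 5.0 pairs per GPU:", nvlink5)
    print("Pairs on an 8-GPU baseboard:", baseboard_pairs(8, nvlink4))
```

Under these assumptions the pair count per GPU is the same for both generations: the bandwidth doubles because each pair runs twice as fast, which shifts the burden from routing density to signal integrity.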
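AI servers use two distinct memory pools: HBM (high-bandwidth memory) stacked on the GPU package, which holds model weights and activations during computation, and DDR5 system memory attached to the host CPU.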
AI servers typically use NVMe SSDs for local storage—primarily for storing model checkpoints and dataset shards during training runs. Storage is secondary to compute in AI workloads, and many training systems use distributed network-attached storage rather than local drives.
Power delivery in an AI server is an engineering challenge in its own right. A fully loaded DGX H100 consumes approximately 10.2 kW; a GB200 NVL72 rack system can exceed 120 kW. At the board level:
Thermal management determines whether the server can sustain full-throttle compute indefinitely or must throttle to stay within thermal limits. H100 and H200 servers commonly use:
B200-based systems operating at 1,000 W per GPU almost universally require direct liquid cooling. Board designs must integrate cold plate mounting structures and ensure adequate thermal contact across the GPU package surface.
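The cooling load can be sized with the basic heat-transport relation: flow = P / (ρ · c · ΔT). The sketch below applies it to the 1,000 W per GPU figure above. The 10 K coolant temperature rise and the use of plain water are assumptions made for illustration; production loops often use water-glycol mixtures with a lower specific heat, which would raise the required flow.

```python
# Coolant flow needed to carry away a given heat load.
# Assumptions: plain water as coolant and a 10 K inlet-to-outlet rise;
# neither value comes from the article.

WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0
WATER_DENSITY_KG_PER_L = 0.997


def coolant_flow_l_per_min(heat_w: float, delta_t_k: float) -> float:
    """Return the water flow in litres per minute that absorbs heat_w watts."""
    if delta_t_k <= 0:
        raise ValueError("temperature rise must be positive")
    mass_flow_kg_s = heat_w / (WATER_SPECIFIC_HEAT_J_PER_KG_K * delta_t_k)
    return mass_flow_kg_s / WATER_DENSITY_KG_PER_L * 60.0


if __name__ == "__main__":
    per_gpu = coolant_flow_l_per_min(1000.0, 10.0)
    print(f"Per 1,000 W GPU: {per_gpu:.2f} L/min")
    print(f"Eight GPUs:      {8 * per_gpu:.2f} L/min")
```

Halving the allowed temperature rise doubles the required flow, which is one reason cold plate contact quality across the GPU package matters as much as pump capacity.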
A single AI server node contains multiple distinct PCB assemblies, each with different design requirements:
| PCB Type | Function | Key Design Requirements |
|---|---|---|
| GPU Baseboard | Mounts GPU modules, routes NVLink, provides PCIe host connection | 24–32 layers, low-loss laminates, HDI, controlled impedance |
| GPU Accelerator Card PCB | Carries GPU die, HBM stacks, power management | Advanced packaging substrate, ultra-fine features |
| Server Motherboard | CPU, DDR5, PCIe routing, BMC | 12–20 layers, DDR5 signal integrity, PCIe Gen5 |
| Power Board / PSU PCB | AC/DC conversion, 48 V bus distribution | Heavy copper, high-voltage isolation, thermal management |
| NVSwitch Board | GPU-to-GPU NVLink routing in rack-scale systems | Extreme layer count, all-layer HDI, ultra-low-loss material |
| Network Interface Card (NIC) | ConnectX-7 or BlueField-3 for cluster networking | PCIe Gen5, 400G SerDes routing |
| Management Board (BMC) | Out-of-band server management | Standard complexity, Ethernet, I2C/SMBus |
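
The Power Board row cites a 48 V distribution bus. A short calculation shows why that voltage matters at the roughly 10.2 kW quoted earlier for a fully loaded DGX H100. The 12 V comparison case and the 0.5 mΩ path resistance below are hypothetical values chosen only to illustrate the scaling; they are not figures from any specific design.

```python
# Bus current and resistive loss at two distribution voltages.
# Assumptions: the 12 V comparison bus and the 0.5 milliohm path resistance
# are hypothetical; the 10.2 kW load and 48 V bus come from the article.


def bus_current_a(power_w: float, bus_voltage_v: float) -> float:
    """Current drawn from the bus: I = P / V."""
    if bus_voltage_v <= 0:
        raise ValueError("bus voltage must be positive")
    return power_w / bus_voltage_v


def conduction_loss_w(current_a: float, resistance_ohm: float) -> float:
    """Resistive loss in the distribution path: P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


if __name__ == "__main__":
    load_w = 10_200.0
    path_resistance_ohm = 0.0005  # hypothetical copper path resistance
    for volts in (48.0, 12.0):
        amps = bus_current_a(load_w, volts)
        loss = conduction_loss_w(amps, path_resistance_ohm)
        print(f"{volts:4.0f} V bus: {amps:6.1f} A, {loss:6.1f} W lost in the path")
```

Because loss scales with the square of the current, a bus at four times the voltage dissipates one sixteenth of the heat in the same copper, which is the main argument for 48 V distribution and heavy copper layers.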
NVLink 4.0 operates at 100 Gb/s per lane; NVLink 5.0 (B200) at 200 Gb/s per lane. PCIe Gen5 runs at 32 GT/s per lane; Gen6 at 64 GT/s using PAM4 encoding. At these data rates, every aspect of the PCB affects signal quality:
The PDN must deliver clean, stable voltage to GPU cores under rapidly changing load conditions. Key design parameters:
Sustained operation at 700–1,000 W per GPU requires board-level thermal features that go beyond component placement:
AI server baseboards are among the most complex PCBs manufactured today:
| Board Type | Typical Layer Count | Laminate Material | Via Technology |
|---|---|---|---|
| H100/H200 GPU Baseboard | 16–20 | Panasonic Megtron 6, Isola Tachyon 100G | Backdrilled through-holes, HDI laser vias |
| B200 GPU Baseboard | 24–32+ | Panasonic Megtron 7, Rogers 4350B | Any-layer HDI, via-in-pad, backdrilling |
| NVSwitch Board (GB200 NVL72) | 32–40+ | Megtron 7, Rogers ultra-low loss | Any-layer HDI, ELIC (Every Layer Interconnect) |
| Server Motherboard | 12–18 | Megtron 6 or equivalent | Standard through-hole + select HDI |
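
The reason for the laminate choices in this table can be illustrated with the standard first-order estimate of dielectric loss, α = 8.686 · π · f · √Dk · Df / c (in dB per metre). The sketch below evaluates it at 16 GHz, the Nyquist frequency of a 32 GT/s PCIe Gen5 lane. The Dk and Df values are illustrative placeholders of the right order of magnitude, not datasheet figures for any named product, and the estimate excludes conductor and surface-roughness loss.

```python
# First-order dielectric loss of a PCB trace, in dB per inch.
# Assumptions: the Dk/Df pairs are illustrative placeholders, not datasheet
# values; conductor loss and copper roughness are not modelled.
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0
NEPER_TO_DB = 20.0 / math.log(10.0)  # about 8.686
METRES_PER_INCH = 0.0254


def dielectric_loss_db_per_inch(freq_hz: float, dk: float, df: float) -> float:
    """Return dielectric attenuation for a given frequency, Dk and Df."""
    alpha_db_per_m = (
        NEPER_TO_DB * math.pi * freq_hz * math.sqrt(dk) * df / SPEED_OF_LIGHT_M_S
    )
    return alpha_db_per_m * METRES_PER_INCH


if __name__ == "__main__":
    nyquist_hz = 16e9  # half of 32 GT/s for NRZ signalling
    materials = {
        "standard FR4 (assumed)": (4.2, 0.020),
        "low-loss laminate (assumed)": (3.4, 0.004),
        "ultra-low-loss laminate (assumed)": (3.3, 0.002),
    }
    for name, (dk, df) in materials.items():
        loss = dielectric_loss_db_per_inch(nyquist_hz, dk, df)
        print(f"{name:<34} {loss:.2f} dB/inch")
```

Since the loss is proportional to Df, a laminate with one fifth the dissipation factor recovers most of the channel budget over a trace run of several inches, which is often the difference between a link that trains reliably and one that does not.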
The fabrication of AI server PCBs requires capabilities that go beyond standard high-volume PCB production. Key manufacturing requirements include:
These requirements mean that AI server PCB production is concentrated among a small number of manufacturers with the capital equipment, process expertise, and quality systems to meet the necessary specifications consistently.
NextPCB supports the full range of PCB fabrication and assembly services required for AI server hardware development and production:
What makes an AI server different from a GPU server?
The terms are often used interchangeably. “GPU server” emphasizes the compute hardware; “AI server” describes the intended workload. In practice, most modern AI training and inference servers are GPU servers, but not all GPU servers are optimized specifically for AI: some use GPU compute for rendering, simulation, or HPC workloads.
How many GPUs does a typical AI server have?
The most common AI server configurations have 8 GPUs per node (e.g., DGX H100, DGX H200). Rack-scale systems like the GB200 NVL72 scale to 72 GPUs in a single rack by connecting multiple nodes through NVLink switches.
Why do AI server PCBs need so many layers?
High layer counts are driven by three factors: the need for multiple dedicated signal routing layers for NVLink and PCIe differential pairs; the need for multiple power and ground planes to support high-current GPU power delivery; and HDI via structures (build-up layers) required for dense BGA breakout routing under fine-pitch GPU and NVSwitch packages.
Can standard PCB manufacturers produce AI server boards?
Not reliably. The combination of high layer count, low-loss materials, HDI vias, backdrilling, copper coin inlays, and fine-pitch BGA assembly requires specialized process capabilities and quality systems that most standard PCB shops do not have.
What is the typical turnaround time for AI server PCB prototypes?
Complexity determines lead time. A 16-layer baseboard prototype with standard HDI typically takes 10–15 business days. A 32-layer board with any-layer HDI, backdrilling, and copper coin integration may require 20–30 business days for the first article.
What certifications should AI server PCB manufacturers hold?
Look for fabrication and assembly to IPC Class 3 (the IPC performance class for high-reliability electronics), ISO 9001 quality management certification, and UL certification for relevant board materials. Manufacturers serving hyperscale customers often additionally hold IATF 16949 or equivalent process control certifications.
From GPU baseboards to NVSwitch boards and server motherboards, NextPCB’s advanced PCB manufacturing capabilities cover the full range of AI server hardware requirements—high layer counts, low-loss materials, HDI, and complete PCBA services.