
What Is an AI Server? Architecture, Components & PCB Requirements

Posted: June 2026. Last updated: June 2026. Writer: Stacy Lu.

The phrase “AI server” has become ubiquitous in technology discussions, but its meaning is often reduced to “a server with GPUs.” From a hardware engineering perspective, that definition falls far short. An AI server is a purpose-built computing system optimized for the unique demands of machine learning workloads—demands that differ fundamentally from those of traditional enterprise or cloud servers in terms of compute density, memory bandwidth, interconnect speed, power consumption, and thermal output.

Understanding those differences is essential for anyone involved in designing, manufacturing, or procuring the printed circuit boards that make AI servers function. This article breaks down AI server architecture from the ground up, with a focus on the PCB-level implications of each major subsystem.

Table of Contents
What Is an AI Server?
How AI Servers Differ from General-Purpose Servers
Core Components of an AI Server
PCB Types Inside an AI Server
PCB Design Challenges in AI Servers
Manufacturing Complexity
NextPCB Capabilities for AI Server Boards
FAQ

What Is an AI Server?

An AI server is a high-performance computing system designed specifically to accelerate artificial intelligence workloads—primarily training and inference of deep learning models. Unlike general-purpose servers optimized for transaction throughput or web serving, AI servers are architected around one priority: maximizing the throughput of tensor operations (matrix multiplications) at scale.

Modern AI servers are characterized by:

Multiple GPU or AI accelerator cards (typically 4, 8, or 16 per server node)
High-bandwidth GPU-to-GPU interconnects (NVLink, Infinity Fabric)
High-bandwidth memory directly on the accelerator package (HBM2e, HBM3, HBM3e)
PCIe Gen4 or Gen5 host interface between CPU and GPU
100G or 400G network interfaces for cluster communication
Power consumption ranging from roughly 3–10 kW per server node to over 120 kW per rack in rack-scale systems

Leading AI server platforms in 2026 include NVIDIA DGX H100, DGX H200, DGX B200, and the GB200 NVL72 rack-scale system, as well as AMD-based platforms using MI300X accelerators.

How AI Servers Differ from General-Purpose Servers

Dimension	General-Purpose Server	AI Server
Primary processor	CPU (multi-core)	GPU / AI Accelerator (thousands of cores)
Memory architecture	DDR5 DIMM, ~100–800 GB/s bandwidth	HBM3e on-package, up to 8 TB/s per GPU
Internal bandwidth	PCIe Gen4/5 (64–128 GB/s)	NVLink 4.0/5.0 (900–1,800 GB/s)
Power per node	300–600 W	3,000–10,000+ W
Cooling requirement	Air cooling (most configurations)	Air or direct liquid cooling (liquid cooling is the norm for B200)
PCB layer count	8–12 layers (typical server board)	16–32+ layers (GPU baseboard)
PCB material	Standard FR4	Low-loss laminates (Megtron 6/7, Rogers, Tachyon)
Design complexity	Moderate	Extremely high (HDI, controlled impedance, CoWoS integration)
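
As a rough cross-check of the PCIe row above, the sketch below estimates x16 link bandwidth from the per-lane transfer rate. It assumes 128b/130b line encoding and ignores packet and protocol overhead, so real throughput is lower; the function name `pcie_bandwidth_gb_s` is illustrative only. The table's 64–128 GB/s figures correspond to the bidirectional totals before encoding overhead is subtracted.

```python
# Rough bandwidth of a PCIe link from its per-lane transfer rate.
# Assumption: 128b/130b line encoding; packet/protocol overhead is ignored.

PCIE_GT_PER_S = {"Gen4": 16.0, "Gen5": 32.0}  # raw transfer rate per lane


def pcie_bandwidth_gb_s(gen: str, lanes: int = 16) -> tuple[float, float]:
    """Return (per-direction, bidirectional) bandwidth in GB/s."""
    raw_gbit = PCIE_GT_PER_S[gen] * lanes      # Gbit/s on the wire
    payload_gbit = raw_gbit * 128 / 130        # after 128b/130b encoding
    per_direction = payload_gbit / 8           # bits to bytes
    return per_direction, 2 * per_direction


for gen in PCIE_GT_PER_S:
    one_way, both_ways = pcie_bandwidth_gb_s(gen)
    print(f"{gen} x16: {one_way:.1f} GB/s per direction, "
          f"{both_ways:.1f} GB/s bidirectional")
```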

The implications for PCB manufacturers are significant: AI server boards represent some of the most technically demanding work in the PCB industry, requiring capabilities that go well beyond standard server board production.

Core Components of an AI Server

1. GPU Accelerator Cards

The accelerator cards are the heart of an AI server. In modern configurations, these are NVIDIA H100, H200, or B200 GPUs in SXM form factor, or AMD MI300X accelerators. Each card is itself a complex PCB assembly—a multi-layer board with a GPU die (or dies, in the case of B200’s dual-die CoWoS package), HBM memory stacks, power management ICs, and thermal interface materials.

SXM-form-factor GPUs do not function as standalone add-in cards. They mount to a baseboard (also called a GPU baseboard or switch board) that provides power, NVLink routing, and PCIe host connectivity. This baseboard is one of the most complex PCBs in the entire server.

2. Host CPU and Motherboard

The host CPU (typically AMD EPYC or Intel Xeon) handles system management, I/O orchestration, and workloads that are poorly suited to massively parallel execution. In AI servers, the CPU is deliberately de-emphasized: a DGX H100, for example, pairs eight H100 GPUs with just two Intel Xeon CPUs.

The motherboard connects the CPU to the GPU baseboards via PCIe, manages system memory (DDR5 DIMMs), and provides network interfaces. In high-end AI servers, the motherboard and GPU baseboard are sometimes integrated into a single large-format PCB for minimum latency.

3. High-Speed Interconnects

GPU-to-GPU communication is handled by NVLink (NVIDIA) or Infinity Fabric (AMD). Within a server node, NVSwitch chips route NVLink traffic between all GPUs; in rack-scale systems like the GB200 NVL72, they also route traffic between nodes.

NVSwitch chips are mounted on the baseboard PCB and require their own high-density routing for dozens of NVLink differential pairs. In a DGX H100, the baseboard contains four NVSwitch chips, each connected to all eight GPUs via NVLink 4.0 traces.

4. Memory Subsystem

AI servers use two distinct memory pools:

HBM (High Bandwidth Memory): Stacked on the GPU package via CoWoS interposer technology. HBM3e on H200 delivers 4.8 TB/s per GPU; HBM3e on B200 delivers 8.0 TB/s. This memory is not user-upgradeable—it is part of the GPU package assembly.
System DRAM (DDR5): Installed as DIMMs on the motherboard, used for the host CPU and for staging data before transfer to GPU memory. Typical AI servers have 512 GB to 2 TB of system DRAM.

5. Storage

AI servers typically use NVMe SSDs for local storage—primarily for storing model checkpoints and dataset shards during training runs. Storage is secondary to compute in AI workloads, and many training systems use distributed network-attached storage rather than local drives.

6. Power Delivery

Power delivery in an AI server is an engineering challenge in its own right. A fully loaded DGX H100 consumes approximately 10.2 kW; a GB200 NVL72 rack system can exceed 120 kW. At the board level:

High-current VRMs (Voltage Regulator Modules) convert 48 V bus power to the 0.8–1.2 V required by GPU cores
Power planes must handle 600–1,000 A of current for each GPU
Transient response requirements are stringent: GPU compute loads change in microseconds, and the PDN must prevent voltage droops that would trigger throttling
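
The current figures above follow from I = P / V. The sketch below is a simplified illustration that treats the whole GPU power budget as if it were drawn from the core rail, which overstates the real core current because memory and I/O rails take part of the budget. The power and voltage values are taken from the ranges quoted in this article; the function names are illustrative.

```python
# Back-of-envelope rail currents for a GPU power stage.
# Assumption: all GPU power is drawn from the core rail and VRM losses are ignored.


def core_current_a(power_w: float, core_voltage_v: float) -> float:
    """Current the VRM output must supply: I = P / V."""
    return power_w / core_voltage_v


def bus_current_a(power_w: float, bus_voltage_v: float = 48.0) -> float:
    """Current drawn from the 48 V bus for the same power."""
    return power_w / bus_voltage_v


for power_w, vcore in [(700.0, 1.0), (700.0, 0.8), (1000.0, 1.0)]:
    print(f"{power_w:.0f} W at {vcore} V core: "
          f"{core_current_a(power_w, vcore):.0f} A on the core rail, "
          f"{bus_current_a(power_w):.1f} A on the 48 V bus")
```

The contrast between the two numbers is the reason for distributing power at 48 V and converting close to the load: the high-current path is kept as short as possible.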

7. Cooling System

Thermal management determines whether the server can sustain full-throttle compute indefinitely or must throttle to stay within thermal limits. H100 and H200 servers commonly use:

High-flow air cooling with multiple large fans per chassis
Direct liquid cooling (DLC) with cold plates on GPU packages

B200-based systems operating at 1,000 W per GPU almost universally require direct liquid cooling. Board designs must integrate cold plate mounting structures and ensure adequate thermal contact across the GPU package surface.
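
To give a sense of scale for liquid cooling, the sketch below estimates the coolant flow needed to carry away a given heat load using Q = m · c_p · ΔT. It assumes plain water as the coolant and a 10 K coolant temperature rise across the cold plate; both are illustrative assumptions, and real loops often use water-glycol mixtures with different properties.

```python
# Coolant flow needed to remove a heat load, from Q = m_dot * c_p * delta_T.
# Assumptions: plain water, approximate room-temperature properties.

WATER_CP_J_PER_KG_K = 4186.0    # specific heat of water, approximate
WATER_DENSITY_KG_PER_L = 1.0    # approximate


def coolant_flow_l_per_min(heat_w: float, delta_t_k: float) -> float:
    """Volumetric flow that carries heat_w with a coolant rise of delta_t_k."""
    mass_flow_kg_s = heat_w / (WATER_CP_J_PER_KG_K * delta_t_k)
    return mass_flow_kg_s / WATER_DENSITY_KG_PER_L * 60.0


# One 1,000 W GPU and an eight-GPU node, both with an assumed 10 K rise.
for heat_w in (1000.0, 8000.0):
    print(f"{heat_w:.0f} W: {coolant_flow_l_per_min(heat_w, 10.0):.2f} L/min")
```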

PCB Types Inside an AI Server

A single AI server node contains multiple distinct PCB assemblies, each with different design requirements:

PCB Type	Function	Key Design Requirements
GPU Baseboard	Mounts GPU modules, routes NVLink, provides PCIe host connection	16–32+ layers, low-loss laminates, HDI, controlled impedance
GPU Accelerator Card PCB	Carries GPU die, HBM stacks, power management	Advanced packaging substrate, ultra-fine features
Server Motherboard	CPU, DDR5, PCIe routing, BMC	12–18 layers, DDR5 signal integrity, PCIe Gen5
Power Board / PSU PCB	AC/DC conversion, 48 V bus distribution	Heavy copper, high-voltage isolation, thermal management
NVSwitch Board	GPU-to-GPU NVLink routing in rack-scale systems	Extreme layer count, all-layer HDI, ultra-low-loss material
Network Interface Card (NIC)	ConnectX-7 or BlueField-3 for cluster networking	PCIe Gen5, 400G SerDes routing
Management Board (BMC)	Out-of-band server management	Standard complexity, Ethernet, I2C/SMBus

PCB Design Challenges in AI Servers

Signal Integrity at High Speeds

NVLink 4.0 operates at 100 Gb/s per lane; NVLink 5.0 (B200) at 200 Gb/s per lane. PCIe Gen5 runs at 32 GT/s per lane; Gen6 reaches 64 GT/s by using PAM4 signaling. At these data rates, every aspect of the PCB affects signal quality:

Dielectric loss: Standard FR4 has a dissipation factor (Df) of ~0.020—acceptable for PCIe Gen3 but unacceptable for NVLink 5.0. Low-loss laminates with Df < 0.003 are required for high-speed routing layers.
Via stubs: Through-hole vias create stubs that resonate at high frequencies and degrade signal integrity. Backdrilling (controlled-depth drilling to remove the stub) is standard practice on NVLink and PCIe Gen5+ traces.
Differential pair matching: Intra-pair skew must be held below 5 ps, and inter-pair skew within a lane group below 50 ps. This requires precise trace length tuning across all routing layers.
Reference plane continuity: Every high-speed trace must have an unbroken reference plane directly above or below it. Plane splits, connector cutouts, and via fields that disrupt the return path cause reflections and emissions.
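
Three of the effects listed above can be estimated with first-order formulas. The sketch below is a simplified illustration, not a substitute for field-solver simulation. It assumes a homogeneous dielectric (stripline-like), uses the Df values quoted above, and uses assumed Dk values of 4.2 for FR4 and 3.4 for a low-loss laminate. The 2 mm stub length and the function names are also illustrative.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def dielectric_loss_db_per_inch(freq_hz: float, dk: float, df: float) -> float:
    """Dielectric attenuation: alpha = pi * f * sqrt(Dk) * Df / c, in Np/m."""
    nepers_per_m = math.pi * freq_hz * math.sqrt(dk) * df / C_M_PER_S
    db_per_m = nepers_per_m * 20 / math.log(10)  # 1 Np is about 8.686 dB
    return db_per_m * 0.0254                     # metres to inches


def stub_resonance_hz(stub_len_m: float, dk: float) -> float:
    """Quarter-wave resonance of an open via stub: f = c / (4 * L * sqrt(Dk))."""
    return C_M_PER_S / (4 * stub_len_m * math.sqrt(dk))


def skew_to_length_mm(skew_s: float, dk: float) -> float:
    """Trace length mismatch that corresponds to a given skew."""
    return skew_s * C_M_PER_S / math.sqrt(dk) * 1e3


NYQUIST_HZ = 16e9  # half of 32 GT/s for NRZ signaling (PCIe Gen5)

fr4 = dielectric_loss_db_per_inch(NYQUIST_HZ, dk=4.2, df=0.020)
low_loss = dielectric_loss_db_per_inch(NYQUIST_HZ, dk=3.4, df=0.003)
print(f"Dielectric loss at 16 GHz: FR4 {fr4:.2f} dB/in, low-loss {low_loss:.2f} dB/in")

print(f"2 mm via stub resonates near {stub_resonance_hz(2e-3, 3.4) / 1e9:.1f} GHz")
print(f"5 ps of skew is about {skew_to_length_mm(5e-12, 3.4):.2f} mm of trace")
```

The loss estimate covers dielectric loss only; conductor loss and surface roughness add to it. The stub estimate shows why backdrilling matters: a stub of a couple of millimetres resonates within the frequency range these links occupy.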

Power Delivery Network

The PDN must deliver clean, stable voltage to GPU cores under rapidly changing load conditions. Key design parameters:

Target impedance at the GPU package is typically < 0.1 mΩ from DC to 100 MHz
VRM placement within 20–40 mm of the GPU package to minimize plane inductance
Bulk capacitance (100–470 μF), mid-frequency capacitance (10–47 μF), and high-frequency decoupling (100 nF–1 μF) must be distributed across the board in a tiered approach
Copper plane thickness of 2–3 oz (70–105 μm) on power and ground layers to minimize resistance
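
The target impedance figure can be sanity-checked with the common first-order estimate Z_target = (V_dd × allowed ripple) / ΔI. The sketch below is illustrative only: the 3% ripple budget and the 400 A load step are assumed values, not figures from any GPU datasheet. It also shows why heavier copper helps, using an approximate copper resistivity and the usual rule that 1 oz copper is about 35 μm thick.

```python
# First-order PDN estimates.
# Assumptions: 3% ripple budget, 400 A load step, copper at about 20 degrees C.

COPPER_RESISTIVITY_OHM_M = 1.68e-8  # approximate
UM_PER_OZ = 35.0                    # 1 oz copper is roughly 35 micrometres thick


def target_impedance_mohm(vdd_v: float, ripple_fraction: float,
                          step_current_a: float) -> float:
    """Z_target = allowed voltage deviation / transient current step."""
    return vdd_v * ripple_fraction / step_current_a * 1e3


def sheet_resistance_mohm_per_sq(copper_oz: float) -> float:
    """Sheet resistance of a copper plane: resistivity / thickness."""
    thickness_m = copper_oz * UM_PER_OZ * 1e-6
    return COPPER_RESISTIVITY_OHM_M / thickness_m * 1e3


print(f"Target impedance: {target_impedance_mohm(0.8, 0.03, 400.0):.3f} mOhm")
for oz in (1, 2, 3):
    print(f"{oz} oz plane: {sheet_resistance_mohm_per_sq(oz):.2f} mOhm per square")
```

With these assumed inputs the estimate lands below 0.1 mΩ, in line with the target quoted above. A single plane's sheet resistance is several times larger than that, which is why multiple parallel power and ground planes are used.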

Thermal Management

Sustained operation at 700–1,000 W per GPU requires board-level thermal features that go beyond component placement:

Thermal vias: Arrays of vias under high-power components (VRMs, GPU mounting area) transfer heat from top-side components through the board to copper ground planes and heat spreaders. Via pitch as tight as 0.4–0.5 mm is common.
Copper coin inserts: Solid copper blocks embedded in the PCB under GPU packages provide a low-resistance thermal path to cooling structures. This requires precise cavity routing during fabrication.
Board material Tg: Sustained operation near high-power components elevates local PCB temperature. Materials with Tg > 180°C are preferred to prevent delamination over thousands of thermal cycles.
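
The benefit of a thermal via array can be estimated by treating each plated barrel as a copper tube conducting heat through the board, with R = L / (k · A). The sketch below is a simplified model under assumed dimensions (3 mm board, 0.3 mm drill, 25 μm plating, 10 mm × 10 mm field); it ignores heat spreading, via fill, and conduction through the laminate itself. The 0.5 mm pitch comes from the range quoted above.

```python
import math

COPPER_K_W_PER_M_K = 390.0  # thermal conductivity of copper, approximate


def via_thermal_resistance_k_per_w(board_thickness_mm: float,
                                   drill_diameter_mm: float,
                                   plating_um: float) -> float:
    """One plated barrel: R = L / (k * A), where A is the copper annulus area."""
    r_outer_m = drill_diameter_mm / 2 * 1e-3
    r_inner_m = r_outer_m - plating_um * 1e-6
    area_m2 = math.pi * (r_outer_m ** 2 - r_inner_m ** 2)
    return (board_thickness_mm * 1e-3) / (COPPER_K_W_PER_M_K * area_m2)


def array_thermal_resistance_k_per_w(side_mm: float, pitch_mm: float,
                                     **via_kwargs: float) -> tuple[float, int]:
    """Square via field: identical vias behave as parallel thermal resistors."""
    vias_per_side = round(side_mm / pitch_mm)
    count = vias_per_side ** 2
    return via_thermal_resistance_k_per_w(**via_kwargs) / count, count


via = dict(board_thickness_mm=3.0, drill_diameter_mm=0.3, plating_um=25.0)
single = via_thermal_resistance_k_per_w(**via)
field, count = array_thermal_resistance_k_per_w(10.0, 0.5, **via)
print(f"Single via: {single:.0f} K/W")
print(f"{count} vias in a 10 mm x 10 mm field: {field:.2f} K/W")
```

A single via is a poor thermal path, but hundreds in parallel bring the through-board resistance down by the same factor, which is why tight via pitch matters.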

Layer Count and Material Selection

AI server baseboards are among the most complex PCBs manufactured today:

Board Type	Typical Layer Count	Laminate Material	Via Technology
H100/H200 GPU Baseboard	16–20	Panasonic Megtron 6, Isola Tachyon 100G	Backdrilled through-holes, HDI laser vias
B200 GPU Baseboard	24–32+	Panasonic Megtron 7, Rogers 4350B	Any-layer HDI, via-in-pad, backdrilling
NVSwitch Board (GB200 NVL72)	32–40+	Megtron 7, Rogers ultra-low loss	Any-layer HDI, ELIC (Every Layer Interconnect)
Server Motherboard	12–18	Megtron 6 or equivalent	Standard through-hole + select HDI

Manufacturing Complexity

The fabrication of AI server PCBs requires capabilities that go beyond standard high-volume PCB production. Key manufacturing requirements include:

Layer registration accuracy: ±50 μm or better across 30+ layer stackups to ensure via alignment and controlled impedance consistency
Laser drilling for microvias: HDI designs require laser-drilled vias as small as 75–100 μm diameter; sequential lamination adds build-up layers that require multiple press cycles
Controlled depth backdrilling: Stub removal on high-speed vias requires drilling to a controlled depth (±50–75 μm), necessitating CNC machines with depth feedback
Copper coin integration: Milling precise cavities for copper inserts, pressing and bonding the copper, then finishing to maintain planarity for GPU module mounting
Advanced BGA assembly: GPUs and NVSwitch chips are large, high-pin-count BGAs requiring X-ray inspection, vacuum reflow profiles, and post-assembly board-level reliability testing
Via-in-pad with epoxy fill: Conductive or non-conductive epoxy fill and planarization of vias under BGA pads to ensure solder joint integrity under fine-pitch packages

These requirements mean that AI server PCB production is concentrated among a small number of manufacturers with the capital equipment, process expertise, and quality systems to meet the necessary specifications consistently.

NextPCB Capabilities for AI Server Boards

NextPCB supports the full range of PCB fabrication and assembly services required for AI server hardware development and production:

High-layer-count fabrication up to 40+ layers
Low-loss laminate processing: Panasonic Megtron 6/7, Isola Tachyon, Rogers series
HDI and any-layer HDI with laser-drilled microvias
Controlled depth backdrilling for via stub removal
Copper coin integration for thermal management
BGA assembly and X-ray inspection
Via-in-pad with epoxy fill and planarization
Full PCBA services from bare board to box build

FAQ

What makes an AI server different from a GPU server?
The terms are often used interchangeably. “GPU server” emphasizes the compute hardware; “AI server” describes the intended workload. In practice, most modern AI training and inference servers are GPU servers, but not all GPU servers are optimized specifically for AI: some use GPU compute for rendering, simulation, or HPC workloads.

How many GPUs does a typical AI server have?
The most common AI server configurations have 8 GPUs per node (e.g., DGX H100, DGX H200). Rack-scale systems like the GB200 NVL72 scale to 72 GPUs in a single rack by connecting multiple nodes through NVLink switches.

Why do AI server PCBs need so many layers?
High layer counts are driven by three factors: the need for multiple dedicated signal routing layers for NVLink and PCIe differential pairs; the need for multiple power and ground planes to support high-current GPU power delivery; and HDI via structures (build-up layers) required for dense BGA breakout routing under fine-pitch GPU and NVSwitch packages.

Can standard PCB manufacturers produce AI server boards?
Not reliably. The combination of high layer count, low-loss materials, HDI vias, backdrilling, copper coin inlays, and fine-pitch BGA assembly requires specialized process capabilities and quality systems that most standard PCB shops do not have.

What is the typical turnaround time for AI server PCB prototypes?
Complexity determines lead time. A 16-layer baseboard prototype with standard HDI typically takes 10–15 business days. A 32-layer board with any-layer HDI, backdrilling, and copper coin integration may require 20–30 business days for the first article.

What certifications should AI server PCB manufacturers hold?
Look for demonstrated capability to build to IPC Class 3 (the high-reliability product class), ISO 9001 quality management certification, and UL certification for the relevant board materials. Manufacturers serving hyperscale customers often additionally hold IATF 16949 (an automotive-sector quality standard) or equivalent process control certifications.

Need to Manufacture AI Server PCBs?

From GPU baseboards to NVSwitch boards and server motherboards, NextPCB’s advanced PCB manufacturing capabilities cover the full range of AI server hardware requirements—high layer counts, low-loss materials, HDI, and complete PCBA services.

Get a quote from NextPCB →

About the Author

Stacy Lu

With extensive experience in the PCB and PCBA industry, Stacy has established herself as a professional and dedicated Key Account Manager with an outstanding reputation. She excels at deeply understanding client needs, delivering effective and high-quality communication. Renowned for her meticulousness and reliability, Stacy is skilled at resolving client issues and fully supporting their business objectives.
