What Is NVLink? How NVIDIA's High-Speed GPU Interconnect Shapes PCB Routing

Posted: June, 2026 Writer: NextPCB - S

As AI models grow, communication between GPUs has become one of the most critical bottlenecks in data center performance. NVIDIA NVLink addresses this with a high-speed, direct GPU-to-GPU interconnect that avoids the bandwidth limits of PCIe. Routing NVLink signals across a printed circuit board (PCB), however, is demanding for hardware engineers: it requires low-loss materials, strict impedance control, and complex via structures. In this guide, we explore what NVLink is, how its bandwidth has evolved, and what it takes to design and manufacture PCBs capable of handling these data rates.

Table of Contents

  1. What is NVIDIA NVLink? (The Basics)
  2. The Evolution of NVLink Bandwidth: From Pascal to Blackwell
  3. NVLink vs. PCIe: An Architectural Comparison
  4. Structural Diagram: PCIe Bus vs. NVLink Topology
  5. How NVLink Shapes High-Speed PCB Routing
  6. PCB Manufacturing Requirements for NVLink Boards
  7. Frequently Asked Questions (FAQ)
  8. Conclusion & Next Steps

What is NVIDIA NVLink? (The Basics)

NVIDIA NVLink is a proprietary, high-speed, direct GPU-to-GPU interconnect technology developed by NVIDIA. It was introduced to address a fundamental problem in high-performance computing (HPC) and artificial intelligence (AI): the Peripheral Component Interconnect Express (PCIe) bus is simply too slow to keep up with the massive data demands of modern GPUs.

In traditional architectures, if GPU A needs to share data with GPU B, the data must travel from GPU A, through the PCIe bus, to the CPU (or a PCIe switch), and then back down to GPU B. This creates a severe bottleneck. NVLink acts as a multi-lane, high-speed highway directly connecting the GPUs to each other, allowing them to share memory and work as a single massive accelerator. When combined with NVIDIA NVSwitch, NVLink can scale across entire server racks, creating massive GPU clusters.

The Evolution of NVLink Bandwidth: From Pascal to Blackwell

To understand the routing challenges, we must look at how NVLink bandwidth has scaled over the years. With every new GPU architecture, NVIDIA has drastically increased the signaling speed and the number of NVLink links per GPU.

  • First Generation (Pascal - P100): Introduced in 2016, offering 160 GB/s bidirectional bandwidth per GPU.
  • Second Generation (Volta - V100): Increased to 300 GB/s using 25 Gb/s signaling rates.
  • Third Generation (Ampere - A100): Jumped to 600 GB/s per GPU, utilizing 50 Gb/s NRZ signaling per lane.
  • Fourth Generation (Hopper - H100): Reached 900 GB/s bidirectional bandwidth by moving to 100 Gb/s-class (112G) PAM4 (Pulse Amplitude Modulation 4-level) signaling, which makes H100 PCB design rules considerably stricter.
  • Fifth Generation (Blackwell - B200/GB200): The latest iteration delivers up to 1.8 TB/s bidirectional bandwidth per GPU, utilizing 200 Gb/s-class (224G) PAM4 signaling.
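
The per-GPU totals above can be reproduced from link count, lanes per link, and per-lane rate. The sketch below is illustrative only: the link and lane counts are taken from NVIDIA's publicly stated figures rather than from this article, so treat them as assumptions, and the result is raw bandwidth with no protocol overhead.

```python
# Per-generation link parameters (assumed from NVIDIA's public figures,
# not stated in this article):
# (name, links per GPU, lanes per direction per link, Gb/s per lane)
GENERATIONS = [
    ("NVLink 1 (P100)", 4, 8, 20),
    ("NVLink 2 (V100)", 6, 8, 25),
    ("NVLink 3 (A100)", 12, 4, 50),
    ("NVLink 4 (H100)", 18, 2, 100),
    ("NVLink 5 (B200)", 18, 2, 200),
]


def bidirectional_gbytes_per_s(links, lanes_per_direction, gbit_per_lane):
    """Raw aggregate bidirectional bandwidth per GPU, in GB/s."""
    # Convert one link's one-way rate from Gb/s to GB/s.
    per_direction_per_link = lanes_per_direction * gbit_per_lane / 8
    # Sum over all links and count both directions.
    return links * per_direction_per_link * 2


for name, links, lanes, rate in GENERATIONS:
    total = bidirectional_gbytes_per_s(links, lanes, rate)
    print(f"{name}: {total:.0f} GB/s")
```

Note that the per-link bandwidth stayed flat from Volta through Hopper in this model; the gains came from adding links while halving the lanes per link, which is why the per-lane signaling rate (and the PCB routing difficulty) had to keep doubling.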

NVLink vs. PCIe: An Architectural Comparison

Why not just use a faster version of PCIe? While PCIe Gen 5 and Gen 6 offer significant improvements, they still lag behind NVLink's raw throughput and are hampered by protocol overhead. Here is a comparison based on the current AI server landscape:

Feature                       | PCIe Gen 5.0 (x16)          | NVIDIA NVLink (4th Gen - Hopper) | NVIDIA NVLink (5th Gen - Blackwell)
Topology                      | Tree topology (CPU-centric) | Mesh / Direct P2P (GPU-centric)  | Mesh / Direct P2P (GPU-centric)
Max Bandwidth (Bidirectional) | 128 GB/s                    | 900 GB/s                         | 1,800 GB/s (1.8 TB/s)
Signaling Type                | 32 GT/s NRZ                 | 112G PAM4                        | 224G PAM4
Primary Use Case              | CPU-to-GPU, Storage, NICs   | GPU-to-GPU Memory Sharing        | GPU-to-GPU Memory Sharing
PCB Routing Difficulty        | High                        | Extreme                          | Ultra-Extreme
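
To make the bandwidth gap in the table concrete, here is a rough best-case estimate of how long a one-way transfer would take on each interconnect. The 10 GB payload is a hypothetical value chosen for illustration, and the calculation assumes half of the bidirectional figure is available in each direction with no protocol overhead.

```python
# Bidirectional bandwidth figures from the comparison table, in GB/s.
INTERCONNECTS = {
    "PCIe Gen 5.0 x16": 128,
    "NVLink 4 (Hopper)": 900,
    "NVLink 5 (Blackwell)": 1800,
}

PAYLOAD_GB = 10  # hypothetical payload size, for illustration only


def transfer_time_ms(payload_gb, bidirectional_gb_per_s):
    """Best-case one-way transfer time in milliseconds."""
    # Assume half of the bidirectional figure is usable per direction.
    one_way_gb_per_s = bidirectional_gb_per_s / 2
    return payload_gb / one_way_gb_per_s * 1000


for name, bandwidth in INTERCONNECTS.items():
    print(f"{name}: {transfer_time_ms(PAYLOAD_GB, bandwidth):.1f} ms")
```

Real transfers will be slower than these figures because of encoding and protocol overhead, but the ratio between the interconnects is the point of the comparison.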

Structural Diagram: PCIe Bus vs. NVLink Topology

Below is a simplified diagram illustrating the architectural difference between a traditional PCIe setup and an NVLink-enabled GPU topology.

TRADITIONAL PCIe ARCHITECTURE:
[ System Memory ]
       |
    [ CPU ]
       |
 [ PCIe Switch ]
   /       \
[GPU 1]  [GPU 2]  --> Communication requires CPU/Switch intervention.

NVLINK ARCHITECTURE (e.g., HGX Baseboard):
[ System Memory ]
       |
    [ CPU ]
       |
 [ PCIe Switch ] --- (For Control/Management)
   /       \
[GPU 1] === [GPU 2]
  ||          ||
[GPU 3] === [GPU 4]
(=== denotes High-Speed NVLink Direct Connections)

How NVLink Shapes High-Speed PCB Routing

Routing NVLink signals, especially 112G and 224G PAM4 used in modern AI servers, is one of the most demanding tasks in hardware engineering. The physical layer (PHY) requires flawless Signal Integrity (SI). Here is how NVLink impacts PCB design:

1. Managing Insertion Loss and Dielectric Absorption

At the Nyquist frequencies involved (roughly 28 GHz for 112G PAM4 and 56 GHz for 224G PAM4), standard FR4 PCB materials act like sponges, absorbing much of the signal energy as dielectric loss. Engineers must route NVLink traces on ultra-low-loss laminates (such as Rogers, Megtron 7/8, or Tachyon) and keep trace lengths as short as possible, which in turn dictates the physical placement of GPUs on OAM (Open Accelerator Module) or HGX baseboards.
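
The relationship between data rate, Nyquist frequency, and usable trace length can be sketched in a few lines. The loss budget and the per-inch loss figures below are hypothetical placeholders, not datasheet values for any laminate; a real design would take them from the SerDes channel specification and measured material data.

```python
def nyquist_ghz(gbit_per_s, bits_per_symbol):
    """Nyquist frequency in GHz: half the symbol rate."""
    symbol_rate_gbaud = gbit_per_s / bits_per_symbol
    return symbol_rate_gbaud / 2


def max_trace_length_in(loss_budget_db, loss_db_per_inch):
    """Longest trace (inches) that stays inside the PCB loss budget."""
    return loss_budget_db / loss_db_per_inch


# PAM4 carries 2 bits per symbol, NRZ carries 1.
print("PCIe Gen 5 (32G NRZ):", nyquist_ghz(32, 1), "GHz")
print("112G PAM4:", nyquist_ghz(112, 2), "GHz")
print("224G PAM4:", nyquist_ghz(224, 2), "GHz")

# Hypothetical numbers for illustration only.
BOARD_LOSS_BUDGET_DB = 10.0
LOSS_DB_PER_INCH = {"standard-loss laminate": 2.5, "ultra-low-loss laminate": 1.0}

for material, loss in LOSS_DB_PER_INCH.items():
    reach = max_trace_length_in(BOARD_LOSS_BUDGET_DB, loss)
    print(f"{material}: up to {reach:.1f} in")
```

Even with placeholder numbers, the pattern is visible: every extra dB per inch of material loss directly shortens the reach, which is why laminate choice and GPU placement are decided together.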

2. PAM4 Encoding and Noise Sensitivity

Older interfaces used NRZ (Non-Return-to-Zero), which has two voltage levels (0 and 1). Recent NVLink generations use PAM4, which packs four voltage levels, each encoding two bits (00, 01, 10, 11), into the same voltage swing. Each of the three PAM4 eyes is therefore only about one-third the height of an NRZ eye, making the link far more sensitive to crosstalk, jitter, and impedance mismatches. Routing requires generous spacing rules between differential pairs (often 3W or 5W, meaning three to five times the trace width) to limit crosstalk.
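
A small sketch can show both ideas: how bit pairs map onto four levels, and how much signal-to-noise margin the smaller eye costs. The Gray-coded mapping below is the one commonly used for PAM4 links in general; whether NVLink uses exactly this mapping is an assumption, and the level values are normalized rather than real voltages.

```python
import math

# Gray-coded PAM4 mapping (assumed): adjacent levels differ by one bit,
# so a one-level decision error corrupts only a single bit.
GRAY_PAM4 = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def pam4_encode(bits):
    """Map a flat list of bits to normalized PAM4 levels, two bits per symbol."""
    if len(bits) % 2 != 0:
        raise ValueError("PAM4 needs an even number of bits")
    return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def eye_height_penalty_db(levels):
    """Eye-height penalty versus NRZ at the same total voltage swing."""
    # N levels split the swing into N - 1 stacked eyes.
    return 20 * math.log10(levels - 1)


print(pam4_encode([0, 0, 0, 1, 1, 1, 1, 0]))
print(f"PAM4 penalty vs. NRZ: {eye_height_penalty_db(4):.2f} dB")
```

Eight bits become four symbols, and the one-third eye height corresponds to roughly 9.5 dB less margin than NRZ, which is the margin that crosstalk and reflections eat into first.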

3. Via Stub Optimization and Backdrilling

When routing NVLink signals from the top layer of a 30+ layer PCB to an internal stripline, the unused remainder of the via barrel (the "stub") behaves as a resonant open-ended transmission line, reflecting energy back into the channel and carving a deep notch in the frequency response. To prevent this, PCB designs must employ extensive backdrilling (controlled depth drilling) to remove these stubs, or rely entirely on blind/buried microvias in an HDI (High Density Interconnect) stackup.
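
The effect of stub length can be estimated with the quarter-wavelength resonance approximation, f = c / (4 * L * sqrt(Dk)). This is a first-order sketch, not a substitute for 3D field simulation; the stub lengths and the dielectric constant of 3.4 are hypothetical example values.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458


def stub_resonance_ghz(stub_length_mm, dk):
    """First quarter-wave resonance of an open via stub, in GHz."""
    stub_length_m = stub_length_mm / 1000
    # The signal travels slower in the dielectric by a factor of sqrt(Dk).
    return SPEED_OF_LIGHT_M_PER_S / (4 * stub_length_m * math.sqrt(dk)) / 1e9


DK = 3.4  # hypothetical dielectric constant for a low-loss laminate

# Hypothetical stub lengths: a long unbackdrilled stub vs. backdrill residues.
for stub_mm in (2.0, 0.5, 0.2):
    f_res = stub_resonance_ghz(stub_mm, DK)
    print(f"{stub_mm} mm stub -> resonance near {f_res:.0f} GHz")
```

A stub of a couple of millimetres resonates in the same range as the signal's Nyquist frequency, while a short backdrill residue pushes the resonance far above it. That is the practical argument for controlled-depth drilling.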

4. Surface Roughness (Copper Foil)

At NVLink frequencies, the "skin effect" confines the electrical current to a very thin layer at the surface of the copper trace. If the copper is rough (roughness is normally added to help the substrate adhere), the current has to follow the surface profile, so the effective path becomes longer and conductor loss increases. NVLink boards therefore use HVLP (Hyper Very Low Profile) or similarly smooth copper foils to maintain signal integrity.
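
To see why roughness matters so much at these frequencies, the skin depth can be computed from the standard formula delta = sqrt(rho / (pi * f * mu)). The sketch below uses the textbook resistivity of pure copper at room temperature; plated or foil copper on a real board will differ somewhat, so treat the output as an order-of-magnitude guide.

```python
import math

MU_0 = 4 * math.pi * 1e-7    # permeability of free space, H/m
RHO_COPPER = 1.68e-8         # resistivity of pure copper, ohm*m (textbook value)


def skin_depth_um(freq_ghz):
    """Skin depth in copper at the given frequency, in micrometres."""
    freq_hz = freq_ghz * 1e9
    depth_m = math.sqrt(RHO_COPPER / (math.pi * freq_hz * MU_0))
    return depth_m * 1e6


for freq in (1, 16, 28, 56):
    print(f"{freq} GHz: skin depth ~ {skin_depth_um(freq):.2f} um")
```

The skin depth falls from about 2 um at 1 GHz to well under half a micron at the Nyquist frequencies of PAM4 links, so surface features on the order of a micron are no longer small compared with the layer carrying the current.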

PCB Manufacturing Requirements for NVLink Boards

Designing an NVLink board is only half the battle; manufacturing it requires top-tier fabrication capabilities. High-speed PCB manufacturing for NVLink typically demands:

  • High Layer Counts: HGX baseboards and OAM modules frequently require 24 to 40 layers to accommodate all ground planes, power delivery networks (PDN), and NVLink routing channels.
  • Tight Impedance Control: Tolerance must be kept within ±5% (compared to the standard ±10%), requiring high-precision etching and LDI (Laser Direct Imaging).
  • Any-Layer HDI: To route out of massive GPU BGA packages (which have thousands of pins), advanced laser-drilled via-in-pad structures are mandatory.
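
The difference between a ±5% and a ±10% impedance tolerance is easy to underestimate. The following sketch checks a measured value against both windows; the 100 ohm differential target and the 94 ohm measurement are hypothetical values for illustration, not NVLink specifications.

```python
def impedance_window(target_ohm, tolerance_pct):
    """Return the (low, high) acceptance limits for a target impedance."""
    delta = target_ohm * tolerance_pct / 100
    return target_ohm - delta, target_ohm + delta


def within_tolerance(measured_ohm, target_ohm, tolerance_pct):
    """True if the measured impedance falls inside the tolerance window."""
    low, high = impedance_window(target_ohm, tolerance_pct)
    return low <= measured_ohm <= high


TARGET_OHM = 100.0   # hypothetical differential impedance target
MEASURED_OHM = 94.0  # hypothetical test-coupon measurement

for tolerance in (10, 5):
    low, high = impedance_window(TARGET_OHM, tolerance)
    verdict = "pass" if within_tolerance(MEASURED_OHM, TARGET_OHM, tolerance) else "fail"
    print(f"+/-{tolerance}%: {low:.0f} to {high:.0f} ohm -> {verdict}")
```

A coupon that would be accepted on a standard board is rejected under the tighter window, which is why fabricators need finer control over etching, dielectric thickness, and imaging to hold yield on these designs.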

Frequently Asked Questions (FAQ)

Does NVLink replace PCIe?

No. NVLink and PCIe co-exist in AI servers. PCIe is still used for the connection between the CPU, NICs (Network Interface Cards like ConnectX), storage drives, and the baseboard. NVLink is dedicated primarily to GPU-to-GPU memory pooling and data transfer.

What connector is used for NVLink?

This depends on the form factor. For PCIe add-in cards that support it (such as the Ampere-generation RTX A6000 or dual-slot server cards like the H100 PCIe), an NVLink bridge physically connects the tops of the cards. In enterprise AI servers (like DGX or HGX systems), NVLink is routed directly through the high-density PCB baseboard or via specialized mezzanine connectors.

Why is PAM4 used in modern NVLink?

PAM4 transmits two bits of data per symbol instead of one. This allows NVIDIA to double the data rate without doubling the symbol rate (and therefore the signal bandwidth), which would cause unmanageable signal loss on the PCB.

Conclusion & Next Steps

NVIDIA NVLink is the backbone of modern AI computing, allowing GPUs to act as a unified brain for training massive Large Language Models (LLMs). However, moving data at terabytes per second turns the PCB itself into a critical bottleneck. Successfully implementing NVLink architecture requires a deep understanding of electromagnetics, ultra-low-loss materials, and flawless manufacturing execution.
