When NVIDIA transitioned from the Ampere generation (A100) to the Hopper generation (H100), the performance numbers made headlines: roughly 3× the training throughput, 1.5× the NVLink bandwidth, and a jump to HBM3 memory. What received far less attention was what those improvements required from the printed circuit boards underneath.
The A100 and H100 are not separated by a minor process shrink or a clock speed bump. They represent a genuine generational leap in semiconductor architecture—one that cascades directly into board-level engineering. Layer counts increased. Material grades changed. Signal integrity rules tightened. Power delivery complexity grew. Thermal requirements pushed several cooling configurations past what air-cooling could reliably sustain.
This article examines the A100-to-H100 transition from the bottom of the PCB stack upward, explaining why engineers designing or manufacturing H100-based infrastructure cannot simply adapt A100 board designs—they must build something fundamentally different.
The NVIDIA A100 was introduced in 2020, built on TSMC's 7 nm process node (designated N7), and represented the first major AI-focused GPU architecture since Volta. It shipped in PCIe and SXM4 form factors and became the dominant AI training accelerator for most of the 2021–2023 period.
The H100 followed in 2022, built on TSMC's 4 nm class process (designated 4N), introducing the Hopper architecture with its dedicated Transformer Engine, fourth-generation Tensor Cores with FP8 support, NVLink 4.0, and HBM3 memory. The H100 ships in PCIe (Gen5) and SXM5 form factors.
Two years separate their introductions; the PCB requirements that accompany them are separated by considerably more.
The A100 is built on the GA100 die, TSMC N7, with 54.2 billion transistors across 826 mm². Key architectural features:
The H100 is built on the GH100 die, TSMC 4N, with 80 billion transistors across 814 mm² (a slightly smaller die than the A100's, at higher transistor density). Key architectural features:
| Specification | A100 SXM4 | H100 SXM5 | Delta |
|---|---|---|---|
| Architecture | Ampere | Hopper | — |
| Process node | TSMC N7 (7 nm class) | TSMC 4N (4 nm class) | ~1.5× transistor density |
| Die size | 826 mm² | 814 mm² | Similar area, higher density |
| Transistor count | 54.2 billion | 80 billion | +48% |
| FP16 / BF16 TFLOPS (dense) | 312 | 989 | ~3.2× |
| FP8 TFLOPS (dense) | Not supported | ~1,979 | New capability |
| FP64 TFLOPS (vector / Tensor Core) | 9.7 / 19.5 | 34 / 67 | ~3.5× |
| Memory type | HBM2e | HBM3 | +68% bandwidth |
| Memory capacity | 80 GB | 80 GB | Equal (H200: 141 GB) |
| Memory bandwidth | 2.0 TB/s | 3.35 TB/s | +68% |
| NVLink generation | NVLink 3.0 | NVLink 4.0 | +50% bandwidth |
| NVLink bandwidth | 600 GB/s bidirectional | 900 GB/s bidirectional | +50% |
| NVLink links | 12 links | 18 links | +50% |
| PCIe generation | PCIe Gen4 ×16 | PCIe Gen5 ×16 | 2× per-lane throughput |
| PCIe bandwidth | ~64 GB/s | ~128 GB/s | 2× |
| TDP (SXM) | 400 W | 700 W | +75% |
| Form factor | SXM4 | SXM5 | Incompatible sockets |
| Cooling (DGX config) | Air or DLC | Air or DLC | DLC preferred at 700 W |
NVLink 3.0 in the A100 provides 600 GB/s total bidirectional bandwidth across 12 links, with each link carrying 50 GB/s. NVLink 4.0 in the H100 increases total bandwidth to 900 GB/s across 18 links—a 50% increase in aggregate bandwidth achieved by adding 6 additional links rather than increasing per-link speed.
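The aggregate figures above follow directly from the link counts. A minimal sketch of that arithmetic, using only the numbers quoted in the text (the function and constant names are illustrative):

```python
# Per-link bidirectional bandwidth quoted in the text, in GB/s.
PER_LINK_GB_S = 50


def nvlink_aggregate(links: int, per_link: float = PER_LINK_GB_S) -> float:
    """Total bidirectional NVLink bandwidth in GB/s for a given link count."""
    return links * per_link


a100_total = nvlink_aggregate(12)  # NVLink 3.0 on A100
h100_total = nvlink_aggregate(18)  # NVLink 4.0 on H100
increase_pct = (h100_total / a100_total - 1) * 100

print(f"A100: {a100_total:.0f} GB/s, H100: {h100_total:.0f} GB/s, +{increase_pct:.0f}%")
```

This prints 600 GB/s for the A100, 900 GB/s for the H100, and a 50% increase.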
For PCB designers, the per-lane signaling rate of NVLink 4.0 is the critical parameter, not just the aggregate bandwidth number. NVLink 4.0 operates at 100 Gb/s per lane (PAM4 signaling), compared to NVLink 3.0's ~50 Gb/s. This per-lane speed doubling is what demands different PCB materials and tighter signal integrity rules: the board must carry twice the data per lane with adequate margin.
The 18 links of NVLink 4.0 also require more PCB routing real estate than the 12 links of NVLink 3.0. In a DGX H100 baseboard routing all-to-all connections between 8 GPUs via 4 NVSwitch chips, the total number of NVLink differential pairs to be routed increases substantially, driving higher layer counts to avoid unacceptable crosstalk between parallel traces.
Both A100 and H100 (base) ship with 80 GB of on-package HBM, but the memory technology differs significantly:
| Parameter | HBM2e (A100 SXM4) | HBM3 (H100 SXM5) | Delta |
|---|---|---|---|
| Total bandwidth | 2.0 TB/s | 3.35 TB/s | +68% |
| Per-pin data rate | 3.6 Gb/s | 6.4 Gb/s | +78% |
| Bus width per stack | 1,024 bits | 1,024 bits | Equal |
| Stack height (max) | 8 Hi | 12 Hi | +50% capacity per stack |
| Voltage | 1.2 V | 1.1 V | Lower power per bit |
From a PCB design standpoint, HBM signals are routed entirely within the CoWoS package substrate (or, in earlier A100 designs, within the SXM module itself), and do not appear on the baseboard PCB as routable signals. The PCB must, however, supply the regulated power rails that feed HBM, and the tighter voltage tolerances of HBM3 (1.1 V ± 30 mV, versus HBM2e at 1.2 V ± 40 mV) translate to tighter noise and ripple budgets on the VDDQ power planes of the H100 baseboard.
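To compare the two tolerance windows on equal terms, it can help to express them as a percentage of the nominal rail voltage. A small sketch using the figures quoted above (the function name is illustrative):

```python
def tolerance_pct(nominal_v: float, tolerance_mv: float) -> float:
    """Allowed deviation as a percentage of the nominal rail voltage."""
    return (tolerance_mv / 1000) / nominal_v * 100


hbm2e_pct = tolerance_pct(1.2, 40)  # HBM2e: 1.2 V +/- 40 mV
hbm3_pct = tolerance_pct(1.1, 30)   # HBM3:  1.1 V +/- 30 mV

print(f"HBM2e window: +/-{hbm2e_pct:.2f}%")
print(f"HBM3 window:  +/-{hbm3_pct:.2f}%")
```

The HBM3 window comes out at roughly ±2.7% against roughly ±3.3% for HBM2e, so the budget shrinks in relative terms as well as in absolute millivolts.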
The A100 uses PCIe Gen4 ×16 for its host CPU interface, providing approximately 64 GB/s of bandwidth. The H100 moves to PCIe Gen5 ×16, doubling this to approximately 128 GB/s.
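The approximate 64 GB/s and 128 GB/s figures are raw bidirectional totals for a ×16 link. The sketch below reproduces them and also shows the slightly lower figure once 128b/130b line encoding is accounted for. The function name and the `encoded` flag are illustrative, not part of any PCIe tooling.

```python
def pcie_bandwidth_gb_s(gt_per_s: float, lanes: int = 16, encoded: bool = False) -> float:
    """Bidirectional link bandwidth in GB/s.

    gt_per_s: transfer rate per lane (16 for Gen4, 32 for Gen5).
    encoded:  if True, apply the 128b/130b encoding overhead.
    """
    efficiency = 128 / 130 if encoded else 1.0
    per_direction = gt_per_s * lanes * efficiency / 8  # bits -> bytes
    return 2 * per_direction


for name, rate in (("Gen4", 16), ("Gen5", 32)):
    raw = pcie_bandwidth_gb_s(rate)
    net = pcie_bandwidth_gb_s(rate, encoded=True)
    print(f"{name} x16: {raw:.0f} GB/s raw, {net:.1f} GB/s after encoding")
```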
PCIe Gen5 runs at 32 GT/s per lane using NRZ encoding—double the 16 GT/s of Gen4. The Nyquist frequency of a Gen5 lane is 16 GHz, compared to 8 GHz for Gen4. This frequency doubling has a directly measurable impact on PCB channel requirements:
The practical consequence: A100 PCIe traces routed on standard low-loss laminate may exceed the Gen5 insertion loss budget if the same material and routing geometry are retained for H100 baseboard designs. PCIe Gen5 signal layers on H100 boards require a lower-loss laminate, reduced trace length, or both.
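The 8 GHz and 16 GHz Nyquist figures quoted for Gen4 and Gen5 are simply half the symbol rate. A minimal sketch, assuming NRZ carries one bit per symbol (the `bits_per_symbol` parameter is an illustrative generalization, not something the PCIe figures above depend on):

```python
def nyquist_ghz(rate_gbps: float, bits_per_symbol: int = 1) -> float:
    """Nyquist frequency in GHz: half the symbol rate.

    NRZ carries 1 bit per symbol, so the symbol rate equals the bit rate.
    """
    symbol_rate_gbaud = rate_gbps / bits_per_symbol
    return symbol_rate_gbaud / 2


gen4 = nyquist_ghz(16)  # PCIe Gen4, 16 GT/s NRZ
gen5 = nyquist_ghz(32)  # PCIe Gen5, 32 GT/s NRZ

print(f"Gen4 Nyquist: {gen4:.0f} GHz, Gen5 Nyquist: {gen5:.0f} GHz")
```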
A100 SXM4 baseboards in DGX A100 configurations typically use 14–18 PCB layers. H100 SXM5 baseboards in DGX H100 configurations typically require 16–20 layers, with some designs reaching 24 layers in configurations that integrate NVSwitch routing directly on the baseboard rather than on a separate switch board.
The layer count increase is driven by three factors acting simultaneously:
The A100 baseboard operates with NVLink 3.0 at ~50 Gb/s per lane. Panasonic Megtron 6 (Df ~0.004 at 10 GHz) is broadly suitable for NVLink 3.0 signal routing at typical trace lengths of 10–20 cm on the baseboard.
The H100 baseboard must support NVLink 4.0 at 100 Gb/s per lane. At this speed, the channel insertion loss budget from GPU pad to NVSwitch pad becomes much tighter. Megtron 6 remains usable on some layers, but the NVLink 4.0 signal routing layers typically require Megtron 6E, Isola Tachyon 100G, or equivalent materials with Df in the 0.002–0.003 range at 10 GHz.
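A rough way to see why Df matters is the textbook approximation for dielectric loss in a transmission line, which scales linearly with frequency and with Df. The sketch below is an estimate under stated assumptions only: Dk = 3.4 is a hypothetical placeholder, not a datasheet value, Df is treated as constant with frequency, and conductor loss, vias, and connectors are ignored.

```python
import math

C = 299_792_458.0                # speed of light, m/s
NEPER_TO_DB = 20 / math.log(10)  # about 8.686 dB per neper
ASSUMED_DK = 3.4                 # hypothetical dielectric constant for illustration


def dielectric_loss_db(freq_ghz: float, df: float, length_cm: float,
                       dk: float = ASSUMED_DK) -> float:
    """Dielectric loss only, in dB, for a trace of the given length."""
    alpha_np_per_m = math.pi * freq_ghz * 1e9 * math.sqrt(dk) * df / C
    return alpha_np_per_m * NEPER_TO_DB * (length_cm / 100)


LENGTH_CM = 15  # mid-point of the 10-20 cm trace lengths mentioned above

for label, freq, df in (
    ("Df 0.004 at 12.5 GHz", 12.5, 0.004),
    ("Df 0.004 at 25 GHz", 25.0, 0.004),
    ("Df 0.002 at 25 GHz", 25.0, 0.002),
):
    print(f"{label}: {dielectric_loss_db(freq, df, LENGTH_CM):.2f} dB")
```

Under these assumptions, keeping Df at 0.004 while doubling the frequency doubles the dielectric loss (about 1.3 dB to about 2.5 dB over 15 cm), and halving Df to 0.002 brings it back to the original level.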
| Layer Function | A100 Baseboard Material | H100 Baseboard Material |
|---|---|---|
| NVLink signal layers | Megtron 6 (Df ~0.004) | Megtron 6E / Tachyon 100G (Df ~0.002–0.003) |
| PCIe signal layers | Megtron 6 | Megtron 6E or better |
| Power and ground planes | Megtron 6 or standard | Megtron 6 or standard |
| Copper foil grade | Low-profile (LP) | Very-low-profile (VLP) on NVLink 4.0 layers |
Smoother copper foil (VLP vs LP) reduces conductor loss at high frequencies, where the skin effect confines current to the rough surface of the foil. At NVLink 3.0 speeds, the difference between LP and VLP copper is small enough to fit within the loss budget. At NVLink 4.0 speeds (100 Gb/s per lane), the additional loss of LP copper can consume enough of the insertion loss budget to turn a passing channel into a failing one.
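The sensitivity to foil roughness follows from how thin the current-carrying layer becomes at these frequencies. The sketch below computes the standard skin depth for copper. The resistivity is a room-temperature handbook value, and the two frequencies are the Nyquist figures used elsewhere in this article.

```python
import math

MU_0 = 4e-7 * math.pi     # permeability of free space, H/m
RHO_COPPER = 1.68e-8      # copper resistivity at room temperature, ohm*m


def skin_depth_um(freq_ghz: float, resistivity: float = RHO_COPPER) -> float:
    """Skin depth in micrometres for a non-magnetic conductor."""
    freq_hz = freq_ghz * 1e9
    return math.sqrt(resistivity / (math.pi * freq_hz * MU_0)) * 1e6


for freq in (12.5, 25.0):
    print(f"{freq} GHz: skin depth = {skin_depth_um(freq):.2f} um")
```

The result is roughly 0.58 µm at 12.5 GHz and 0.41 µm at 25 GHz. Once the skin depth is well below the scale of the foil's surface profile, current has to follow that profile, which is why a smoother foil pays off more at the higher rate.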
The transition from NVLink 3.0 (A100) to NVLink 4.0 (H100) tightens every signal integrity specification:
| SI Parameter | A100 / NVLink 3.0 | H100 / NVLink 4.0 |
|---|---|---|
| Per-lane signaling rate | ~50 Gb/s | 100 Gb/s |
| Nyquist frequency | ~12.5 GHz | ~25 GHz |
| Differential impedance target | 100 Ω ± 10% | 100 Ω ± 5% |
| Intra-pair skew budget | < 10 ps | < 5 ps |
| Via stub tolerance | < 20 mils (backdrilling recommended) | < 10 mils (backdrilling required) |
| Near-end crosstalk (NEXT) | < −25 dB at 12 GHz | < −30 dB at 25 GHz |
| Far-end crosstalk (FEXT) | < −35 dB at 12 GHz | < −40 dB at 25 GHz |
The tightening of impedance tolerance from ±10% to ±5% has direct manufacturing implications: etching uniformity, dielectric thickness control, and registration accuracy all contribute to impedance variation, and the tighter spec requires closer process control throughout fabrication.
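One way to see what the tighter tolerance buys is to look at the reflection from a single impedance step at the edge of each tolerance window. This is a simplified model, assuming one abrupt transition from a nominal 100 Ω section into an off-target section; real channels have many distributed discontinuities.

```python
import math


def reflection_coefficient(z_actual: float, z_nominal: float = 100.0) -> float:
    """Voltage reflection coefficient at a step from z_nominal to z_actual."""
    return (z_actual - z_nominal) / (z_actual + z_nominal)


def return_loss_db(z_actual: float, z_nominal: float = 100.0) -> float:
    """Return loss in dB (larger is better) for the same impedance step."""
    return -20 * math.log10(abs(reflection_coefficient(z_actual, z_nominal)))


for label, z in (("+10% (110 ohm)", 110.0), ("+5% (105 ohm)", 105.0)):
    gamma = reflection_coefficient(z)
    print(f"{label}: gamma = {gamma:.4f}, return loss = {return_loss_db(z):.1f} dB")
```

Under this model the worst-case reflection at +5% is about half that at +10% (roughly 32.3 dB return loss versus 26.4 dB).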
The 75% increase in TDP from A100 (400 W) to H100 (700 W) per GPU is the most straightforward power delivery challenge. But the H100's PDN requirements go beyond scaling the A100 design for higher current:
The TDP increase from 400 W to 700 W per GPU—a 75% increase—changes the thermal management calculus at the board level:
The SXM5 socket (H100) has higher pin count and finer pitch than SXM4 (A100), increasing BGA escape routing complexity. H100 baseboard designs more extensively use:
| PCB Design Parameter | A100 SXM4 Baseboard | H100 SXM5 Baseboard |
|---|---|---|
| Typical layer count | 14–18 | 16–24 |
| NVLink signal layers laminate | Megtron 6 (Df ~0.004) | Megtron 6E / Tachyon 100G (Df ~0.002–0.003) |
| Copper foil (NVLink layers) | Low-profile (LP) | Very-low-profile (VLP) |
| Differential impedance tolerance | 100 Ω ± 10% | 100 Ω ± 5% |
| Via backdrilling | Recommended on NVLink layers | Required on NVLink 4.0 and PCIe Gen5 layers |
| HDI type | 1+N+1 typical | 1+N+1 to 2+N+2; via-in-pad standard |
| GPU TDP | 400 W | 700 W |
| VCORE current per GPU | ~300–350 A | ~500 A+ |
| PDN target impedance | ~0.2 mΩ DC–100 MHz | < 0.15 mΩ DC–100 MHz |
| Thermal via pitch | 0.6–0.8 mm | 0.4–0.6 mm |
| Board material Tg | ≥ 150°C | ≥ 170–180°C |
| Cooling (DGX config) | Air or DLC | DLC strongly preferred |
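The PDN target impedance row can be related to the VCORE current row through the common rule of thumb Z_target = allowed ripple voltage / transient current step. The sketch below is illustrative only: the 0.8 V rail, 4% ripple allowance, and 50% load step are hypothetical values chosen for the example, not figures from NVIDIA or from this article.

```python
ASSUMED_RAIL_V = 0.8         # hypothetical core rail voltage
ASSUMED_RIPPLE_FRAC = 0.04   # hypothetical allowed ripple (4% of rail)
ASSUMED_STEP_FRAC = 0.5      # hypothetical transient step (50% of max current)


def target_impedance_mohm(max_current_a: float,
                          rail_v: float = ASSUMED_RAIL_V,
                          ripple_frac: float = ASSUMED_RIPPLE_FRAC,
                          step_frac: float = ASSUMED_STEP_FRAC) -> float:
    """PDN target impedance in milliohms: allowed ripple / current step."""
    allowed_ripple_v = rail_v * ripple_frac
    current_step_a = max_current_a * step_frac
    return allowed_ripple_v / current_step_a * 1000


a100_z = target_impedance_mohm(350)  # upper end of the ~300-350 A range
h100_z = target_impedance_mohm(500)  # lower bound of the ~500 A+ figure

print(f"A100: {a100_z:.3f} mOhm, H100: {h100_z:.3f} mOhm")
```

With these placeholder inputs the targets come out near 0.18 mΩ and 0.13 mΩ. The point is the scaling: for the same ripple allowance, a larger current step pushes the target impedance down proportionally.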
The PCB design differences described above translate into concrete manufacturing process changes when transitioning from A100 to H100 baseboard production:
Organizations running A100-based infrastructure and planning an H100 upgrade should understand the hardware compatibility boundaries:
| Component | A100 Compatible with H100? | Notes |
|---|---|---|
| Server chassis | No (in most cases) | SXM4 and SXM5 sockets are physically incompatible; DGX H100 is a new chassis design |
| GPU baseboard PCB | No | Completely different design; SXM5 socket, NVLink 4.0 routing, PCIe Gen5, higher power delivery |
| Host CPU / motherboard | Partial | H100 needs a PCIe Gen5 host for full host bandwidth; Gen4 hosts still work but halve the PCIe bandwidth |
| Power supply units | No (for DGX) | H100 DGX at 10.2 kW requires higher-capacity PSUs than A100 DGX at ~6.5 kW |
| Cooling infrastructure | Partial | Existing DLC loops can be reused if flow capacity is sufficient; new cold plates required for SXM5 |
| Network switches (InfiniBand) | Yes | ConnectX-7 NICs are compatible with existing NDR InfiniBand fabric |
| Software stack (CUDA, drivers) | Yes | H100 is fully backward-compatible with A100 CUDA code; driver update required |
The practical conclusion: transitioning from A100 to H100 is a server-level replacement, not a component upgrade. The GPU baseboard, chassis, and power delivery infrastructure must all change. Cooling infrastructure may be partially reused if it has adequate capacity for the higher heat load. The software stack is portable.
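The power supply row in the table can be put in context by comparing GPU TDP against total system power. A short sketch using the figures quoted in this article, and assuming eight GPUs per DGX system in both generations:

```python
GPUS_PER_SYSTEM = 8  # assumed for both DGX A100 and DGX H100


def gpu_power_share(gpu_tdp_w: float, system_power_w: float,
                    gpus: int = GPUS_PER_SYSTEM) -> tuple[float, float]:
    """Return (total GPU power in W, GPU share of system power in %)."""
    gpu_total_w = gpu_tdp_w * gpus
    return gpu_total_w, gpu_total_w / system_power_w * 100


for name, tdp, system_w in (("DGX A100", 400, 6500), ("DGX H100", 700, 10200)):
    total_w, share = gpu_power_share(tdp, system_w)
    print(f"{name}: GPUs draw {total_w / 1000:.1f} kW of {system_w / 1000:.1f} kW ({share:.0f}%)")
```

GPU power alone rises from 3.2 kW to 5.6 kW per system, which accounts for most of the increase in system power.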
Is the A100 still worth deploying in 2026?
Yes, in specific contexts. The A100 remains cost-effective for fine-tuning workloads in the 7B–30B parameter range, for multi-tenant inference using MIG partitioning, and for organizations with tightly constrained budgets where the lower acquisition and infrastructure cost of A100-based systems outweighs the performance deficit relative to H100. The A100 is a mature, well-supported platform with extensive ecosystem tooling.
Can an H100 GPU be installed in an A100 server chassis?
No. The SXM5 and SXM4 sockets are physically incompatible. An H100 cannot be installed in a DGX A100 chassis without replacing the GPU baseboard PCB, which requires a chassis redesign in practice. H100 requires a purpose-built SXM5 baseboard.
Why does the H100 have a 75% higher TDP than the A100 but “only” 3× the training performance?
The performance improvement is architecture-driven, not just power-scaling. The Transformer Engine and FP8 support in Hopper deliver step-change improvements for transformer model training that are not available on Ampere at any power level. The 75% TDP increase reflects a denser, more capable GPU die (80B vs 54B transistors) operating at higher throughput—the performance/watt ratio improved significantly from A100 to H100.
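The performance-per-watt claim can be checked against the dense FP16/BF16 throughput and SXM TDP figures from the comparison table. A minimal sketch (peak datasheet numbers only; sustained training efficiency will differ):

```python
def tflops_per_watt(tflops: float, tdp_w: float) -> float:
    """Peak throughput per watt of TDP."""
    return tflops / tdp_w


a100_eff = tflops_per_watt(312, 400)  # A100 SXM4: dense FP16/BF16, 400 W
h100_eff = tflops_per_watt(989, 700)  # H100 SXM5: dense FP16/BF16, 700 W
improvement = h100_eff / a100_eff

print(f"A100: {a100_eff:.2f} TFLOPS/W, H100: {h100_eff:.2f} TFLOPS/W, ratio {improvement:.2f}x")
```

On these peak figures the H100 delivers roughly 1.8× the FP16/BF16 throughput per watt, before counting FP8.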
What is the main PCB material change between A100 and H100 baseboard designs?
The most critical material change is on the NVLink signal routing layers. A100 baseboards can use Panasonic Megtron 6 (Df ~0.004) for NVLink 3.0 signals. H100 NVLink 4.0 signal layers require lower-loss materials, typically Megtron 6E or Isola Tachyon 100G (Df ~0.002–0.003), because the higher per-lane signaling rate (100 Gb/s vs ~50 Gb/s) roughly doubles the dielectric loss over the same trace length.
Is backdrilling required for A100 or just recommended?
For A100 NVLink 3.0 traces, backdrilling is strongly recommended but not universally required—some board designs achieve adequate signal margins without it by controlling trace lengths and via stub depths. For H100 NVLink 4.0 traces, backdrilling is required; the via stub resonance at 25 GHz (NVLink 4.0 Nyquist) falls within the signal band and degrades channel performance below spec without stub removal.
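The stub-length sensitivity can be estimated with the quarter-wave approximation, in which an open stub resonates where its length equals a quarter of the wavelength in the dielectric. This is a first-order sketch: Dk = 3.4 is a hypothetical value, and real via stubs resonate lower than this estimate because of pad and barrel capacitance.

```python
import math

C = 299_792_458.0  # speed of light, m/s
ASSUMED_DK = 3.4   # hypothetical dielectric constant for illustration
MIL_TO_M = 25.4e-6


def stub_resonance_ghz(stub_length_m: float, dk: float = ASSUMED_DK) -> float:
    """Quarter-wave resonance frequency of an open via stub, in GHz."""
    return C / (4 * stub_length_m * math.sqrt(dk)) / 1e9


for label, length_m in (
    ("1.5 mm stub (no backdrill, thick board)", 1.5e-3),
    ("20 mil residual stub", 20 * MIL_TO_M),
    ("10 mil residual stub", 10 * MIL_TO_M),
):
    print(f"{label}: ~{stub_resonance_ghz(length_m):.0f} GHz")
```

Under these assumptions a 1.5 mm stub resonates near 27 GHz, close to the 25 GHz Nyquist frequency, while backdrilling to a 10 mil residual stub pushes the resonance out to roughly 160 GHz, far above the signal band.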
Does H100 use a different NVSwitch chip than A100?
Yes. A100 uses NVSwitch 2.0 (with NVLink 3.0). H100 uses NVSwitch 3.0 (with NVLink 4.0), which also doubles the switch chip's total bandwidth. The NVSwitch 3.0 is a larger, higher-power chip that imposes its own PCB routing and power delivery requirements on the baseboard or switch board that carries it.
Whether you are producing A100-generation boards for continued deployment or designing new H100-based infrastructure, NextPCB provides the high-layer-count fabrication, low-loss laminate processing, HDI, backdrilling, and PCBA capabilities that AI server boards demand.