NVIDIA · March 2022
H100
PCIe
The NVIDIA H100 PCIe is a high-performance GPU designed for data centers, targeting AI, machine learning, and high-performance computing workloads. Built on the Hopper architecture, it delivers significant gains in performance and efficiency over its Ampere predecessors. The PCIe variant is optimized for standard PCIe-based systems, offering flexible deployment across a wide range of server configurations.

Benchmarks & Throughput
Structured Sparsity: Supported (up to 2x vs dense)
Transformer Throughput: Supported (Transformer Engine)
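Structured sparsity above refers to Hopper's 2:4 pattern: within every contiguous group of four weights, at most two are non-zero, which the sparse Tensor Cores exploit for up to 2x matmul throughput. As a rough illustration only (the helper below is our own sketch, not the production cuSPARSELt path), PyTorch code that prunes a weight matrix to the 2:4 pattern:

```python
import torch

def prune_2_to_4(weight: torch.Tensor) -> torch.Tensor:
    """Zero the two smallest-magnitude entries in each group of four,
    yielding the 2:4 pattern Hopper's sparse Tensor Cores accept."""
    rows, cols = weight.shape
    assert cols % 4 == 0, "2:4 groups run along the input dimension"
    groups = weight.reshape(-1, 4)
    keep = groups.abs().topk(2, dim=1).indices        # two largest |w| per group
    mask = torch.zeros_like(groups, dtype=torch.bool)
    mask.scatter_(1, keep, True)
    return (groups * mask).reshape(rows, cols)

w = torch.randn(128, 256)
w_sparse = prune_2_to_4(w)
assert (w_sparse.reshape(-1, 4) != 0).sum(dim=1).max() <= 2  # at most 2 of every 4
```

Recent PyTorch releases also expose a prototype `torch.sparse.to_sparse_semi_structured` conversion that can dispatch eligible matmuls to the sparse kernels.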
Workload Readiness
LLM Training
The H100 PCIe, based on the Hopper architecture, is well suited to training large language models. With 80 GB of HBM2e, PCIe Gen 5 host connectivity, and optional NVLink bridges between pairs of cards, it supports multi-node training of models with hundreds of billions of parameters when weights are sharded across GPUs.
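The Transformer Engine called out in the specs is exposed through NVIDIA's `transformer_engine` Python package, which runs eligible matmuls in FP8 with per-tensor scaling. A minimal sketch, assuming a Hopper GPU and the package installed (layer sizes are arbitrary):

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# HYBRID format: E4M3 for forward tensors, E5M2 for gradients.
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.HYBRID)

layer = te.Linear(4096, 4096, bias=True).cuda()
x = torch.randn(8, 4096, device="cuda", requires_grad=True)

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)          # forward matmul runs on FP8 Tensor Cores
y.sum().backward()        # backward uses the recipe's gradient format
```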
LLM Inference
The H100 PCIe is highly efficient for inference tasks, offering excellent token-per-second performance and sufficient KV cache headroom, making it ideal for deploying large-scale language models.
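The "KV cache headroom" claim is easy to sanity-check with back-of-envelope math: the cache holds one K and one V tensor per layer for every token in flight. A small estimator with an illustrative 7B-class configuration (the model shape is an assumption, not a measured H100 figure):

```python
def kv_cache_gib(n_layers, n_kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """KV cache size: K and V per layer per in-flight token (FP16/BF16 = 2 bytes)."""
    total = 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem
    return total / 2**30

# Hypothetical 7B-class model: 32 layers, 32 KV heads, head_dim 128.
print(kv_cache_gib(32, 32, 128, seq_len=4096, batch=16))  # -> 32.0 GiB
```

With roughly 14 GB of FP16 weights for such a model, that batch still fits comfortably within the card's 80 GB.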
Vision Training
With its advanced Tensor Cores and high memory bandwidth, the H100 PCIe excels in vision training tasks, providing significant speedups for large-scale image classification and object detection models.
Diffusion Models
The H100 PCIe is well-suited for diffusion models, benefiting from its high computational throughput and memory capacity, enabling efficient training and inference of complex generative models.
Multimodal AI
The H100 PCIe's architecture supports multimodal AI tasks effectively, leveraging its Tensor Cores for processing diverse data types and large datasets, making it ideal for applications like image-text models.
Reinforcement Learning
The H100 PCIe offers excellent performance for reinforcement learning, with its high throughput and ability to handle complex simulations and large state spaces efficiently.
HPC / Simulation
The H100 PCIe provides robust support for HPC simulations, with strong FP64 performance, making it suitable for scientific and engineering applications requiring high precision.
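FP64 throughput is straightforward to spot-check. A rough microbenchmark sketch using CUDA events in PyTorch (the matrix size is arbitrary and 2n^3 is the standard matmul FLOP count; results vary with clocks and drivers):

```python
import torch

n = 8192
a = torch.randn(n, n, dtype=torch.float64, device="cuda")
b = torch.randn(n, n, dtype=torch.float64, device="cuda")

torch.matmul(a, b)                     # warm-up
torch.cuda.synchronize()

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
start.record()
c = torch.matmul(a, b)
end.record()
torch.cuda.synchronize()

secs = start.elapsed_time(end) / 1e3   # elapsed_time returns milliseconds
print(f"FP64 matmul: {2 * n**3 / secs / 1e12:.1f} TFLOP/s")
```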
Scientific Computing
The H100 PCIe excels in scientific computing tasks, offering high double-precision performance and memory bandwidth, ideal for complex simulations and data analysis.
Edge Inference
The H100 PCIe is less suited for edge inference due to its higher power consumption and larger form factor, making it more appropriate for data center deployments.
Real-Time Serving
The H100 PCIe is highly capable for real-time AI serving, with its low latency and high throughput, making it ideal for deploying AI models in production environments.
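For real-time serving, tail latency matters as much as raw throughput. The generic harness below (function name and iteration count are our choices, not a production load test) times synchronized forward passes and reads off percentiles:

```python
import time
import torch

@torch.inference_mode()
def latency_profile(model, example, iters=200):
    """Per-request GPU latency; assumes model and input already on the device."""
    model(example)                  # warm-up
    torch.cuda.synchronize()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        model(example)
        torch.cuda.synchronize()    # wait for kernels before stopping the clock
        samples.append(time.perf_counter() - t0)
    samples.sort()
    return {"p50": samples[len(samples) // 2],
            "p99": samples[int(len(samples) * 0.99)]}
```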
Fine-Tuning
The H100 PCIe is highly efficient for full fine-tuning tasks, thanks to its large VRAM and advanced architecture, allowing for the fine-tuning of large models with minimal overhead.
LoRA Efficiency
The H100 PCIe is also efficient for LoRA fine-tuning, providing sufficient resources to handle parameter-efficient training methods effectively.
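In practice, LoRA on a single H100 PCIe means freezing the base weights and training low-rank adapters, typically via the Hugging Face `peft` library. A minimal sketch (the model ID, rank, and `target_modules` are illustrative and architecture-dependent):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustrative model ID; access on the Hub may be gated.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype="auto", device_map="auto")

config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                    target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, config)
model.print_trainable_parameters()   # typically well under 1% trainable
```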
Market Authority
MLPerf Ranking
The NVIDIA H100 PCIe is officially listed in MLPerf Training v3.0 and Inference v3.1 results, with performance data published by NVIDIA and partners.
Cloud Adoption
NVIDIA has publicly confirmed H100 PCIe availability on Google Cloud, Microsoft Azure, and Amazon Web Services (AWS) as of late 2023.
Supercomputer Usage
H100 accelerators power supercomputers such as NVIDIA's Eos, though that system uses the SXM variant; the PCIe card is more commonly found in academic and enterprise clusters built from standard PCIe servers.
Research Citations
The H100 PCIe is cited in peer-reviewed papers and arXiv preprints from 2023 onward, particularly in large language model and HPC research.
Community Benchmarks
Community benchmarks for H100 PCIe are available on sites like MLPerf, Hugging Face forums, and independent blogs, though most public benchmarks focus on the SXM variant.
GitHub Support
Official support for H100 PCIe is present in major deep learning frameworks (PyTorch, TensorFlow, JAX) and libraries (NVIDIA cuDNN, CUDA 12.x), with explicit references in GitHub repositories and release notes.
Enterprise Cases
NVIDIA and partners (e.g., Dell, HPE) have published case studies highlighting H100 PCIe deployments in enterprise AI and HPC workloads.
Key Strengths
The H100 PCIe excels at AI training and inference, offering substantial performance gains in deep learning workloads thanks to its fourth-generation Tensor Cores and high memory bandwidth. It is also well suited to scientific simulation and data analytics, making it a versatile option for complex computational tasks.
Limitations
While the H100 PCIe offers excellent performance, it lacks the full NVLink/NVSwitch fabric of the SXM variant: an NVLink bridge can join a pair of cards at up to 600 GB/s, but larger topologies fall back to PCIe Gen 5 or networking, which can bottleneck communication-heavy multi-GPU workloads. Its 350 W board power is modest next to the SXM module's, though dense deployments may still require power and cooling upgrades in some data centers. Availability can also be constrained by high demand and limited supply.
Expert Insight
The H100 represents a strategic leap in AI compute. When comparing cloud providers, consider not just the hourly rate but also interconnect bandwidth (InfiniBand/NVLink) and regional availability, both of which can significantly affect total cost of ownership for large-scale training.