GPU Cloud Provider · Seattle, WA, USA

AWS

AWS offers G4 instances optimized for machine learning inference and graphics-intensive workloads. They provide cost-effective NVIDIA (g4dn) or AMD (g4ad) GPU options, suited to tasks like AI model deployment, game streaming, and high-end graphics rendering.

GPUs: 1
Founded: 2006
Countries: 24
Data Centers: 31
Uptime SLA: 99.99%
Team Size: 10,000+

GPU Marketplace

$42.00/hour

Company Profile

Company Type: Hyperscaler
Provider Type: Hyperscaler
Founded: 2006
Headquarters: Seattle, WA, USA
Legal Entity: Amazon Web Services, Inc.
Parent Company: Amazon.com, Inc.
Funding: Public (NASDAQ: AMZN)
Total Raised: Not applicable (subsidiary of Amazon)
Team Size: 10,000+

Infrastructure

GPU Fleet: NVIDIA H100 SXM (p5.48xlarge), NVIDIA H200 (p5e instances), NVIDIA A100 80GB SXM (p4de.24xlarge), NVIDIA A100 40GB (p4d.24xlarge), NVIDIA V100 (p3 instances), NVIDIA T4 (g4dn instances), NVIDIA L4 (g6 instances), NVIDIA L40S (g6e instances), AWS Trainium (trn1 instances), AWS Inferentia2 (inf2 instances)
Network Fabric: 100 Gbps Ethernet, Elastic Fabric Adapter
Connectivity: Up to 100 Gbps
Storage: Local NVMe-based SSD, Amazon EBS (Elastic Block Store), Amazon S3 (Simple Storage Service)
Data Center Tier: Tier 3+ equivalent; proprietary AWS data centers with custom power and cooling infrastructure
Bare Metal: Yes, via EC2 bare metal instances (e.g., p4de.24xlarge, p5.48xlarge)
Availability: GA (Generally Available)
Enterprise · Startup · Research · Government · Education

Compute & Deployment

On-Demand: Yes
Spot / Interruptible: Yes (up to 90% savings via EC2 Spot Instances, market-based pricing)
Reserved Instances: Yes (1-year and 3-year terms, Standard and Convertible options; Savings Plans also available)
Bare Metal: Yes (EC2 Bare Metal instances available for select GPU instance types)
VM-Based: Yes (EC2 GPU instances: P3, P4, P5, G4, G5, G6, Trn1, Inf2)
Container-Based: Yes (Docker via ECS, ECR, and EKS; Fargate for serverless containers)
Kubernetes: Yes (managed K8s via Amazon EKS, with GPU node group support and Neuron device plugin)
Serverless GPU: Yes (AWS Inferentia via SageMaker Serverless Inference; limited GPU serverless options)
Spin-Up Time: 2-5 minutes for standard GPU instances; 10+ minutes for large P5 instances or during capacity constraints
Terraform: Yes (official provider hashicorp/aws on the HashiCorp Registry, maintained by AWS and HashiCorp)

GPU Hardware

Latest Gen: H100 SXM, H100 PCIe, L40S, Trainium2
Legacy Support: A100, V100, T4, A10G, K80
Multi-GPU Nodes: Yes (up to 8x per node)
Max GPUs/Node: 8
NVLink: Yes (NVLink 4.0 on SXM nodes)
InfiniBand: No native InfiniBand; EFA (an Ethernet-based fabric) delivers 3,200 Gbps aggregate on p5 instances
PCIe vs SXM: Both PCIe and SXM
HGX Platform: Yes (HGX H100 8-GPU on p5 instances)
Liquid Cooling: Select SKUs (p5 instances with H100 SXM)

Pricing Model

Per Hour: Yes (primary billing unit)
Per Minute: Per-second billing (60-second minimum)
Subscription: Yes (Reserved Instances: 1-year and 3-year terms)
Reserved Discount: Up to 72% off with a 3-year all-upfront Reserved Instance commitment
Spot Discount: Up to 90% off on-demand with Spot Instances
Public Pricing: Yes
Hidden Fees: Public IPv4 addresses ($0.005/hr), EBS storage billed separately from the instance price, inter-AZ data transfer ($0.01/GB each way)
Egress Charges: $0.09/GB for the first 10 TB/month, tiered down to $0.05/GB above 150 TB; free within the same region
Pay-as-you-go: Yes
Credit System: Yes (AWS Credits via promotional programs and enterprise agreements)
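The discount and tier figures above reduce to simple arithmetic. A minimal Python sketch, assuming the $42.00/hour marketplace rate quoted on this page; the two intermediate egress rates ($0.085 and $0.07/GB) are assumptions drawn from AWS's published tier schedule, not from this page:

```python
# Illustrative cost math using the figures quoted on this page.
ON_DEMAND_HOURLY = 42.00       # marketplace rate shown above
SPOT_MAX_DISCOUNT = 0.90       # "up to 90% off on-demand"
RESERVED_3YR_DISCOUNT = 0.72   # "up to 72% off" (3-yr all-upfront)

# Egress tiers (tier size in GB, USD/GB). First and last rates come from
# the page; the middle two are assumptions based on AWS's tiered schedule.
EGRESS_TIERS = [
    (10 * 1024, 0.09),     # first 10 TB/month
    (40 * 1024, 0.085),    # next 40 TB (assumption)
    (100 * 1024, 0.07),    # next 100 TB (assumption)
    (float("inf"), 0.05),  # above 150 TB
]

def billed_seconds(run_seconds: int) -> int:
    """Per-second billing with a 60-second minimum."""
    return max(60, run_seconds)

def egress_cost(gb: float) -> float:
    """Walk the tiers, charging each slice at its own rate."""
    cost, remaining = 0.0, gb
    for size, rate in EGRESS_TIERS:
        slice_gb = min(remaining, size)
        cost += slice_gb * rate
        remaining -= slice_gb
        if remaining <= 0:
            break
    return cost

if __name__ == "__main__":
    print(f"spot floor:   ${ON_DEMAND_HOURLY * (1 - SPOT_MAX_DISCOUNT):.2f}/hr")
    print(f"3yr reserved: ${ON_DEMAND_HOURLY * (1 - RESERVED_3YR_DISCOUNT):.2f}/hr")
    print(f"45 s run bills as {billed_seconds(45)} s")
    print(f"20 TB egress: ${egress_cost(20 * 1024):,.2f}")
```

At these rates a 20 TB month of egress splits into a 10 TB slice at $0.09 and a 10 TB slice at the assumed $0.085, which is why cost estimation needs the full tier table rather than a single per-GB figure.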

Performance & Scaling

Multi-Node Training: Yes (1,000+ nodes with NCCL and AWS ParallelCluster/EFA)
Max Cluster Size: 20,000+ GPUs (via UltraClusters with P4d/P5 instances)
Elastic Scaling: Yes (add/remove nodes dynamically via Auto Scaling Groups and SageMaker)
Auto Scaling: Yes (policy-based scaling via EC2 Auto Scaling and SageMaker endpoint scaling)
InfiniBand: No native InfiniBand; EFA provides 3,200 Gbps aggregate per node on P5 UltraClusters and 400 Gbps per node on P4d instances
NVSwitch: Yes (P4d and P5 SXM instances use NVSwitch for intra-node GPU communication)
SLA: 99.99%
Perf Isolation: Yes (P4d and P5 are bare metal by default; other EC2 bare metal instances also available)
Noisy Neighbor Mitigation: Yes (bare metal P4d/P5 instances with no hypervisor sharing; dedicated tenancy available on all instance types)
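The node and fabric figures above imply straightforward capacity arithmetic. A back-of-the-envelope sketch, assuming the 8-GPU nodes and 3,200 Gbps per-node EFA bandwidth quoted in this section:

```python
import math

GPUS_PER_NODE = 8          # max GPUs/node quoted above
EFA_GBPS_PER_NODE = 3200   # P5 per-node aggregate EFA bandwidth (per page)

def nodes_needed(total_gpus: int) -> int:
    """Smallest number of 8-GPU nodes covering a target GPU count."""
    return math.ceil(total_gpus / GPUS_PER_NODE)

def aggregate_fabric_gbps(total_gpus: int) -> int:
    """Upper-bound aggregate NIC bandwidth across the cluster (Gbps)."""
    return nodes_needed(total_gpus) * EFA_GBPS_PER_NODE
```

By this arithmetic, a 20,000-GPU UltraCluster is 2,500 nodes, which is why the job scheduler and NCCL topology, not single-node performance, dominate at that scale.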

Developer Experience

Onboarding: Self-service via AWS Console, CLI, or SDK; instances launch in minutes; enterprise onboarding via AWS account teams and solution architects
Frameworks: TensorFlow, PyTorch, Apache MXNet, CUDA, cuDNN
SDK Languages: Python, Java, Go, Node.js, Ruby, PHP, C++, .NET, Rust
CLI Tooling: Full AWS CLI with comprehensive EC2/SageMaker management; AWS CDK for infrastructure as code; Session Manager for SSH-less access
Jupyter: Via Amazon SageMaker Studio (native JupyterLab), SageMaker Notebooks, or self-hosted Jupyter on EC2
Templates: SageMaker JumpStart ML models, Deep Learning AMIs (DLAMI) with PyTorch/TensorFlow, LLM fine-tuning via SageMaker, Stable Diffusion on EC2, distributed training with SageMaker
Model Marketplace: AWS Marketplace with thousands of ML models; SageMaker JumpStart built-in model library including foundation models; Amazon Bedrock for managed LLM APIs
Documentation: Comprehensive docs with tutorials, API references, whitepapers, and AWS Skill Builder training courses
API Features: AWS CLI, SDKs for popular languages, RESTful API, AWS CloudFormation
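A self-service launch through the Python SDK (boto3) reduces to building a RunInstances request. A minimal, hypothetical sketch: the parameter builder below is pure Python, `ami-EXAMPLE` is a placeholder, and the commented-out boto3 call assumes configured AWS credentials and a real AMI ID:

```python
# Sketch of a programmatic GPU-instance launch request. The builder is
# plain Python; the commented-out call shows how it would be used.

def gpu_launch_params(instance_type: str = "g6.xlarge",
                      ami_id: str = "ami-EXAMPLE",  # placeholder, not a real AMI
                      count: int = 1,
                      spot: bool = False) -> dict:
    """Build keyword arguments for the EC2 RunInstances API."""
    params = {
        "ImageId": ami_id,            # e.g. a Deep Learning AMI (DLAMI)
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
    }
    if spot:
        # Request interruptible capacity at the spot market price.
        params["InstanceMarketOptions"] = {"MarketType": "spot"}
    return params

# import boto3
# ec2 = boto3.client("ec2", region_name="us-east-1")
# ec2.run_instances(**gpu_launch_params("p5.48xlarge", spot=True))
```

Keeping request construction separate from the client call makes the launch configuration easy to unit-test and to reuse across on-demand and spot paths.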

Security & Compliance

Security: Comprehensive vulnerability assessments and regular security audits
Compliance: ISO 27018, ISO 27001, ISO 22301, PCI DSS, SOC 1, SOC 2, SOC 3
ISO 27001, SOC 1/2/3, PCI DSS, HIPAA, and FedRAMP High certified
Used by over 1 million active customers globally
NVIDIA Select Cloud Service Provider
Named a Leader in the Gartner Magic Quadrant for Cloud Infrastructure for 14 consecutive years
AWS GovCloud for US government workloads

Data Center Locations

Coverage

Countries: United States, Canada, Brazil, Ireland, United Kingdom, Germany, France, Italy, Spain, Sweden, Switzerland, Israel, United Arab Emirates, South Africa, India, Singapore, Japan, South Korea, Australia, New Zealand, Hong Kong, Indonesia, Malaysia, Thailand
Cities: Ashburn VA, Columbus OH, Dallas TX, Portland OR, San Jose CA, Montreal, Sao Paulo, Dublin, London, Frankfurt, Paris, Milan, Madrid, Stockholm, Zurich, Tel Aviv, Dubai, Cape Town, Mumbai, Hyderabad, Singapore, Tokyo, Osaka, Seoul, Sydney, Melbourne, Auckland, Hong Kong, Jakarta, Kuala Lumpur, Bangkok
Multi-Region Failover: Yes (automatic and manual failover via Route 53, Global Accelerator, and multi-AZ/multi-region architectures)
Latency Tiers: Ultra-low (<1 ms intra-AZ), standard cloud latency inter-region, CloudFront edge for <10 ms globally
North America · Europe · Asia-Pacific · South America · Middle East · Africa

Compliance Regions

EU Data Residency: Yes (Frankfurt, Dublin, Paris, Milan, Madrid, Stockholm, Zurich)
US Gov Cloud: Yes (FedRAMP authorized, AWS GovCloud US-East and US-West regions, DoD IL2/IL4/IL5/IL6)
India Region: Yes (Mumbai, Hyderabad)

Key Strengths

Largest global infrastructure footprint with 30+ regions ensuring low-latency access worldwide
Purpose-built AI silicon (AWS Trainium, Inferentia2) offering cost-effective alternatives to NVIDIA for training and inference
Deep integration across entire cloud stack (storage, networking, databases, ML services) enabling end-to-end ML pipelines
SageMaker managed ML platform reduces infrastructure management overhead
Broad enterprise compliance portfolio (100+ certifications) enabling regulated industries to adopt GPU workloads

Known Limitations

H100 and high-end GPU availability can be constrained; long waitlists or limited spot availability in some regions
Pricing is complex with hundreds of instance types and pricing dimensions; cost optimization requires expertise
Vendor lock-in risk due to proprietary services like SageMaker, Trainium, and Inferentia
Egress data transfer costs can be significant for large-scale ML workloads
On-demand GPU pricing is among the higher-cost options compared to specialized GPU cloud providers

Additional Information

Support Options

["24/7 technical support","Documentation","Community forums","Premium Support tiers (additional cost)"]

Community

AWS re:Post community forums, AWS Discord server, AWS Heroes program, GitHub (aws org with 500+ repos), AWS re:Invent annual conference, active Stack Overflow presence

Green Energy

Committed to 100% renewable energy by 2025 (largely achieved); net-zero carbon by 2040 under The Climate Pledge; 100+ renewable energy projects globally

PUE Rating

1.2 (global average; AWS reports 1.15-1.25 range depending on region)
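PUE divides total facility power by IT-equipment power, so a 1.2 average implies roughly 20% overhead for cooling and power distribution. A one-line sketch using the figure above:

```python
# PUE = facility_kw / it_kw, so facility draw = IT load * PUE.

def facility_power_kw(it_kw: float, pue: float = 1.2) -> float:
    """Total facility draw implied by an IT load at a given PUE."""
    return it_kw * pue

def overhead_kw(it_kw: float, pue: float = 1.2) -> float:
    """Cooling/power-distribution overhead beyond the IT load itself."""
    return it_kw * (pue - 1.0)

# A 1,000 kW IT load at PUE 1.2 draws about 1,200 kW total,
# roughly 200 kW of which is non-IT overhead.
```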

Core Proposition

Broadest GPU instance portfolio (P4, P5, G5, Trn1, Inf2) integrated with the world's largest cloud ecosystem including SageMaker, EKS, and 200+ managed services.

Notable Customers

Netflix
Airbnb
NASA
Samsung
Pfizer
Goldman Sachs
Epic Games
Snap

Payment Methods

Credit Card · Wire Transfer · AWS Marketplace · Invoice (enterprise) · AWS Credits
Last updated March 2026. Information subject to change.