GPU Cloud Provider · Seattle, Washington, USA

Amazon Web Services (AWS)

Amazon EC2 G4 instances provide cost-effective and versatile GPU options for machine learning inference and graphics-intensive applications using NVIDIA and AMD GPUs. These instances are optimized for various applications, including machine learning, gaming, and virtual workstations, providing a blend of performance and cost efficiency.

GPUs
1
Founded
2006 (launched as part of Amazon.com)
Countries
26
Data Centers
39
Uptime SLA
99.99%
Team Size
10,000+

GPU Marketplace

Company Profile

Company Type: Hyperscaler
Provider Type: Hyperscaler
Founded: 2006 (launched as part of Amazon.com)
Headquarters: Seattle, Washington, USA
Legal Entity: Amazon Web Services, Inc.
Parent Company: Amazon.com, Inc.
Funding: Public (NASDAQ: AMZN)
Total Raised: Not disclosed (subsidiary of Amazon.com, Inc.)
Team Size: 10,000+

Infrastructure

GPU Fleet: NVIDIA H100 SXM (p5), NVIDIA A100 80GB SXM (p4de), NVIDIA A100 40GB (p4d), NVIDIA V100 (p3), NVIDIA T4 (g4dn), NVIDIA A10G (g5), NVIDIA L4 (g6), NVIDIA L40S (g6e), AWS Trainium (trn1), AWS Inferentia (inf2)
Network Fabric: Up to 100 Gbps Ethernet on general GPU instances; Elastic Fabric Adapter (EFA) for HPC-class networking on p4/p5
Connectivity: Up to 100 Gbps (higher aggregate with EFA on p4d/p5)
Storage: Local NVMe-based SSD storage
Data Center Tier: Proprietary AWS data centers meeting Tier 3+ equivalents; ISO 27001, SOC 1/2/3 certified
Bare Metal: Yes, via bare metal EC2 instances (e.g., p4de.24xlarge, p5.48xlarge)
Availability: Generally Available
Enterprise · Startup · Research · Government · ISV
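The fleet table above maps neatly onto instance-family prefixes, which is handy for tagging instances or attributing costs by accelerator. A minimal Python sketch (family-to-GPU pairs taken from the table above; the helper name `gpu_for` is illustrative, not an AWS API):

```python
# Instance family -> accelerator, per the GPU Fleet table above.
GPU_BY_FAMILY = {
    "p5": "NVIDIA H100 SXM",
    "p4de": "NVIDIA A100 80GB SXM",
    "p4d": "NVIDIA A100 40GB",
    "p3": "NVIDIA V100",
    "g4dn": "NVIDIA T4",
    "g5": "NVIDIA A10G",
    "g6": "NVIDIA L4",
    "g6e": "NVIDIA L40S",
    "trn1": "AWS Trainium",
    "inf2": "AWS Inferentia",
}

def gpu_for(instance_type: str) -> str:
    """Return the accelerator for an EC2 instance type like 'p5.48xlarge'."""
    # EC2 type names are '<family>.<size>', so the family is the first segment.
    family = instance_type.split(".", 1)[0]
    return GPU_BY_FAMILY[family]
```

Families not in the table (e.g., retired generations) raise `KeyError`, which is usually the right failure mode for cost-attribution scripts.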

Compute & Deployment

On-Demand: Yes
Spot / Interruptible: Yes (up to 90% savings via EC2 Spot Instances; AWS sets Spot prices based on supply and demand, auction bidding having been retired in 2017)
Reserved Instances: Yes (1-year and 3-year terms, Standard and Convertible options)
Bare Metal: Yes (EC2 Bare Metal instances available for select GPU families)
VM-Based: Yes (EC2 GPU instances: P4, P5, G4, G5, G6 families)
Container-Based: Yes (Docker via ECS and EKS)
Kubernetes: Yes (managed Kubernetes via Amazon EKS)
Serverless GPU: No (SageMaker offers managed inference endpoints, but not a purely serverless GPU product)
Spin-Up Time: 1–5 minutes (on-demand EC2); longer for capacity-constrained instance types like P5
Terraform: Yes (official provider on the HashiCorp registry: hashicorp/aws)
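Since Spot capacity and full SDK access are both supported, Spot GPU launches can be scripted as well as declared in Terraform. A hedged sketch that builds the request parameters for boto3's `ec2.run_instances` as a plain dict, so they can be inspected or tested without touching the AWS API (the AMI ID in the usage note is a placeholder, and `g4dn.xlarge` is just an example default):

```python
def spot_gpu_request(ami_id: str, instance_type: str = "g4dn.xlarge") -> dict:
    """Build kwargs for boto3 ec2.run_instances requesting one Spot GPU instance."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # Ask for Spot capacity; terminate (rather than stop) on interruption.
        "InstanceMarketOptions": {
            "MarketType": "spot",
            "SpotOptions": {
                "SpotInstanceType": "one-time",
                "InstanceInterruptionBehavior": "terminate",
            },
        },
    }

# Usage (requires AWS credentials; the AMI ID is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   ec2.run_instances(**spot_gpu_request("ami-0123456789abcdef0"))
```

Keeping the request as a dict also makes it easy to diff against the equivalent Terraform `aws_instance` resource during a migration.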

GPU Hardware

Latest Gen: H100 SXM, H200 SXM, L40S, Trainium2, Inferentia2
Legacy Support: A100, V100, T4, A10G, K80
Multi-GPU Nodes: Yes (up to 8 GPUs per node)
Max GPUs/Node: 8
NVLink: Yes (NVLink 3.0 on p4 SXM nodes, NVLink 4.0 on p5 H100 SXM nodes)
InfiniBand: No native InfiniBand; EFA over Ethernet fills the role (3,200 Gbps aggregate on p5 instances, 400 Gbps on p4d)
PCIe vs SXM: Both PCIe and SXM form factors offered
HGX Platform: Yes (HGX H100 8-GPU on p5 instances)

Pricing Model

Per Hour: Yes (primary billing unit)
Per Minute: Billing is actually per second, with a 60-second minimum
Subscription: Yes (Savings Plans with 1-year and 3-year commitments)
Reserved Discount: Up to 72% off with 3-year Reserved Instances; up to roughly 40% with a 1-year commitment
Spot Discount: Up to 90% off on-demand with EC2 Spot Instances
Public Pricing: Yes
Hidden Fees: Public IPv4 addresses ($0.005/hr), inter-AZ data transfer ($0.01/GB each way), EBS storage billed separately, CloudWatch monitoring fees
Egress Charges: First 100 GB/month free, then tiered: $0.09/GB (up to 10 TB), $0.085/GB (10–50 TB), $0.07/GB (50–150 TB), lower beyond that; free within the same region
Pay-as-you-go: Yes
Credit System: Yes (AWS credits via promotional programs, AWS Activate for startups, partner credits)
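The tiered egress schedule above can be turned into a quick cost estimator. A minimal sketch using only the rates listed here (it assumes 1 TB = 1,000 GB, that the free 100 GB counts toward the first tier's 10 TB boundary, and it does not model the unlisted rates beyond 150 TB; verify against the current AWS price list before relying on it):

```python
# Marginal internet-egress tiers per the schedule above: (upper bound in GB, $/GB).
TIERS = [
    (100, 0.00),      # first 100 GB/month free
    (10_000, 0.09),   # up to 10 TB
    (50_000, 0.085),  # 10-50 TB
    (150_000, 0.07),  # 50-150 TB; rates beyond this are not listed above
]

def egress_cost(gb: float) -> float:
    """Estimate the monthly internet egress bill in USD for `gb` transferred."""
    cost, prev = 0.0, 0.0
    for bound, rate in TIERS:
        if gb <= prev:
            break
        # Charge the marginal rate on the portion falling inside this tier.
        cost += (min(gb, bound) - prev) * rate
        prev = bound
    return round(cost, 2)

# e.g. 1.1 TB out in a month: 100 GB free, then 1,000 GB at $0.09/GB = $90.00
```

This kind of marginal-tier loop is also the shape to use for the inter-AZ and IPv4 line items if you fold them into the same estimator.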

Performance & Scaling

Multi-Node Training: Yes (1,000+ nodes with EFA and NCCL via SageMaker or AWS ParallelCluster)
Max Cluster Size: 20,000+ GPUs (AWS UltraClusters with p4d/p5 instances)
Elastic Scaling: Yes (add/remove nodes dynamically via Auto Scaling Groups or SageMaker managed clusters)
Auto Scaling: Yes (policy-based auto-scaling via EC2 Auto Scaling and SageMaker endpoint auto-scaling)
InfiniBand: No native InfiniBand; EFA delivers 3,200 Gbps aggregate on p5.48xlarge UltraClusters and 400 Gbps on p4d instances
NVSwitch: Yes (NVSwitch for intra-node GPU communication on p4d and p5 SXM instances)
SLA: 99.99%
Perf Isolation: Yes (dedicated bare metal on p4d, p5, and Inf instances; metal instance types involve no hypervisor sharing)
Noisy Neighbor: Mitigated (bare metal p4d.24xlarge and p5.48xlarge share no hardware; dedicated tenancy options available)

Developer Experience

Onboarding: Self-service via AWS Console or CLI; deploy in minutes with an existing AWS account; enterprise onboarding with a dedicated SA available
Frameworks: Support for major machine learning frameworks compatible with NVIDIA and AMD GPUs
SDK Languages: Python, Java, JavaScript, TypeScript, Go, Ruby, PHP, .NET, C++, Rust
CLI Tooling: Full AWS CLI with extensive EC2, SageMaker, and EKS support; CloudFormation and CDK for infrastructure as code
Jupyter: Native via Amazon SageMaker Studio (managed JupyterLab); also supported on EC2 via self-managed setup
Templates: LLM Training on SageMaker, Stable Diffusion on EC2, PyTorch Training, TensorFlow Training, Hugging Face on SageMaker, Ray on AWS, Deep Learning AMIs
Model Marketplace: AWS Marketplace with AI/ML models; Amazon Bedrock for managed foundation models; SageMaker JumpStart model hub
Documentation: Comprehensive docs with tutorials, API reference, whitepapers, and AWS re:Post community Q&A
API Features: AWS CLI, SDKs for popular programming languages, REST API, AWS CloudFormation

Security & Compliance

Security: Regular security assessments under AWS's comprehensive, shared-responsibility security model
Compliance: ISO 27001, SOC 1/2/3, PCI DSS, HIPAA eligible, FedRAMP High authorized
Largest cloud provider by market share (~32%) · Used by the majority of Fortune 500 companies · AWS re:Invent annual conference with 50,000+ attendees · NVIDIA DGX-Ready Cloud Partner

Data Center Locations

Coverage

Countries: United States, Germany, Ireland, United Kingdom, France, Sweden, Spain, Italy, Japan, South Korea, Singapore, Australia, India, Canada, Brazil, South Africa, United Arab Emirates, Israel, Bahrain, Malaysia, Indonesia, Thailand, New Zealand, Hong Kong, Taiwan, China
Cities: Ashburn VA, Columbus OH, San Jose CA, Seattle WA, Portland OR, Miami FL, Dallas TX, Chicago IL, New York NY, Frankfurt, Dublin, London, Paris, Stockholm, Madrid, Milan, Tokyo, Osaka, Seoul, Singapore, Sydney, Melbourne, Mumbai, Pune, Toronto, Montreal, Sao Paulo, Cape Town, Dubai, Tel Aviv, Bahrain, Kuala Lumpur, Jakarta, Bangkok, Auckland, Hong Kong, Taipei, Beijing, Ningxia
Multi-Region Failover: Yes (automatic and manual failover via Route 53, Global Accelerator, and multi-AZ/multi-region architectures)
Latency Tiers: Ultra-low (<1 ms intra-AZ), low (1–10 ms inter-AZ), standard cloud latency inter-region; CloudFront edge under 10 ms for many endpoints
North America · Europe · Asia-Pacific · South America · Middle East · Africa

Compliance Regions

EU Data Residency: Yes (Frankfurt, Dublin, London, Paris, Stockholm, Madrid, Milan; GDPR compliant with the AWS Data Processing Addendum)
US Gov Cloud: Yes (FedRAMP High authorized; AWS GovCloud US-East and US-West regions, DoD IL2/IL4/IL5 compliant)
India Region: Yes (Mumbai ap-south-1, Hyderabad ap-south-2)

Key Strengths

Broadest GPU instance portfolio including proprietary Trainium and Inferentia chips
Deep integration with managed ML platform SageMaker for end-to-end MLOps
Unmatched global infrastructure with 30+ regions and 90+ availability zones
Extensive partner ecosystem and AWS Marketplace for AI/ML tools
EFA (Elastic Fabric Adapter) for ultra-low latency GPU cluster networking

Known Limitations

Complex pricing model with many instance types and add-on costs
H100 and latest GPU instances frequently face availability constraints in popular regions
Proprietary Trainium/Inferentia chips require custom SDK (Neuron) with limited framework support
SageMaker abstraction can reduce flexibility for advanced ML engineers
Egress costs can be significant for large-scale data transfer out of AWS

Additional Information

Support Options

24/7 support through AWS Support plans (Basic, Developer, Business, Enterprise)

Community

AWS re:Post community forums, GitHub (aws org with 500+ repos), active Stack Overflow presence, AWS User Groups globally, Discord and Slack communities for specific services

Green Energy

Committed to 100% renewable energy by 2025 (achieved in 2023); net-zero carbon by 2040 under The Climate Pledge

PUE Rating

1.2 (AWS-reported global average)

Core Proposition

Broadest GPU instance portfolio with deepest integration across managed ML services, networking, storage, and global infrastructure at hyperscale.

Notable Customers

Netflix
Airbnb
Samsung
BMW
Goldman Sachs
Pfizer
NASA
Snap

Payment Methods

Credit Card · Wire Transfer · AWS Marketplace · Invoice (enterprise)
Last updated March 2026. Information subject to change.