GPU Cloud Provider · Seattle, Washington, USA

AWS (Amazon)

AWS G4 instances provide cost-effective, high-performance GPU capacity for a variety of applications, including machine learning inference, small-scale training, and graphics-intensive workloads. G4dn instances use NVIDIA T4 GPUs, while G4ad instances use AMD Radeon Pro V520 GPUs, covering different performance and cost points.

Founded: 2006 (AWS launch)
Countries: 25
Data Centers: 32
Uptime SLA: 99.99%
Team Size: 10,000+


Company Profile

Company Type: Hyperscaler
Provider Type: Hyperscaler
Founded: 2006 (AWS launch)
Headquarters: Seattle, Washington, USA
Legal Entity: Amazon Web Services, Inc.
Parent Company: Amazon.com, Inc.
Funding: Public (NASDAQ: AMZN)
Total Raised: Not applicable (subsidiary of Amazon)
Team Size: 10,000+

Infrastructure

GPU Fleet: NVIDIA H100 SXM (p5.48xlarge), NVIDIA A100 80GB SXM (p4de.24xlarge), NVIDIA A100 40GB (p4d.24xlarge), NVIDIA V100 (p3 instances), NVIDIA T4 (g4dn instances), NVIDIA A10G (g5 instances), NVIDIA L4 (g6 instances), NVIDIA L40S (g6e instances), AWS Trainium (trn1 instances), AWS Inferentia (inf2 instances)
Total GPU Capacity: Not disclosed (one of the largest GPU fleets globally)
Network Fabric: High-speed networking up to 100 Gbps; Elastic Fabric Adapter (EFA)
Connectivity: Up to 100 Gbps
Storage: Local NVMe-based SSD storage
Data Center Tier: Tier 3+ equivalent; AWS-proprietary design with redundant power and cooling across all regions
Bare Metal: Yes, via Dedicated Hosts and bare metal instance types (e.g., p4d.24xlarge bare metal)
Availability: GA (Generally Available)
Enterprise, Startup, Research, Government, Independent Software Vendors
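For programmatic work against the fleet list above, a simple lookup from instance family to accelerator is often handy. This is a minimal sketch built only from the mappings stated in this profile; treat it as a snapshot, not an authoritative AWS catalog.

```python
# Instance family -> accelerator, as listed in this profile's GPU Fleet entry.
GPU_BY_FAMILY = {
    "p5": "NVIDIA H100 SXM",
    "p4de": "NVIDIA A100 80GB SXM",
    "p4d": "NVIDIA A100 40GB",
    "p3": "NVIDIA V100",
    "g4dn": "NVIDIA T4",
    "g5": "NVIDIA A10G",
    "g6": "NVIDIA L4",
    "g6e": "NVIDIA L40S",
    "trn1": "AWS Trainium",
    "inf2": "AWS Inferentia",
}

def gpu_for_instance(instance_type: str) -> str:
    """Map an EC2 instance type like 'p5.48xlarge' to its accelerator.

    The family is everything before the first dot, so 'g6e.xlarge'
    resolves to 'g6e' (not 'g6'). Raises KeyError for unknown families.
    """
    family = instance_type.split(".", 1)[0]
    return GPU_BY_FAMILY[family]
```

A lookup like `gpu_for_instance("g4dn.xlarge")` returns `"NVIDIA T4"`, matching the fleet entry above.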

Compute & Deployment

On-Demand: Yes
Spot / Interruptible: Yes (up to 90% savings via EC2 Spot Instances; prices are set by supply and demand, as AWS retired auction-style bidding in 2017)
Reserved Instances: Yes (1-year and 3-year terms, Standard and Convertible options)
Bare Metal: Yes (EC2 bare metal instances available for select GPU instance types)
VM-Based: Yes (EC2 GPU instances: p3, p4, p5, g4, g5, g6 families)
Container-Based: Yes (Docker via ECS, EKS, and AWS Batch)
Kubernetes: Yes (managed Kubernetes via Amazon EKS)
Serverless GPU: No (SageMaker inference endpoints abstract the infrastructure but are not serverless in the traditional sense)
Spin-Up Time: 2-5 minutes for on-demand; Spot varies with capacity availability
Terraform: Yes (official HashiCorp-registry provider: hashicorp/aws)

GPU Hardware

Latest Gen: H100 SXM, L40S, GH200
Legacy Support: A100 SXM, V100, T4, A10G
Multi-GPU Nodes: Yes (up to 8x per node)
Max GPUs/Node: 8
Pool Size: 100,000+ GPUs
NVLink: Yes (NVLink 4.0 on H100 SXM p5 nodes)
InfiniBand: No native InfiniBand; Elastic Fabric Adapter (EFA) fills the same role, with 3,200 Gbps aggregate on p5 and 400 Gbps on p4d/p4de
PCIe vs SXM: Both PCIe and SXM
HGX Platform: Yes (HGX H100 8-GPU on p5 instances)
Liquid Cooling: Air-cooled only

Pricing Model

Per Hour: Yes (primary billing unit)
Per Minute: No; billing is per-second with a 60-second minimum
Subscription: Yes (Reserved Instances: 1-year and 3-year terms)
Reserved Discount: Up to 72% off with a 3-year all-upfront Reserved Instance commitment
Spot Discount: Up to 90% off on-demand with Spot Instances
Public Pricing: Yes
Hidden Fees: Public IPv4 address charges ($0.005/hr), inter-AZ data transfer ($0.01/GB each way), EBS volumes billed separately from instances
Egress Charges: Tiered pricing: first 100 GB/month free, then $0.09/GB up to 10 TB, with decreasing tiers beyond
Pay-as-you-go: Yes
Credit System: Yes (AWS Credits via promotional programs, startup credits, and education grants)
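The headline numbers above are enough for rough budgeting. This sketch applies only the figures stated in this profile: the "decreasing tiers beyond" 10 TB are not listed here, so the egress estimate conservatively keeps $0.09/GB past that point, and the discount percentages are best-case ceilings, not guaranteed rates.

```python
def monthly_egress_cost(gb: float) -> float:
    """Estimate monthly internet egress (USD): first 100 GB free,
    then $0.09/GB. Over-estimates past 10 TB, where real tiers drop."""
    free_gb, rate = 100.0, 0.09
    return max(gb - free_gb, 0.0) * rate

def effective_hourly(on_demand: float, plan: str) -> float:
    """Best-case hourly rate under the headline discounts above:
    72% off for a 3-year all-upfront RI, 90% off for Spot."""
    discount = {"on_demand": 0.0, "reserved_3yr": 0.72, "spot": 0.90}[plan]
    return on_demand * (1.0 - discount)
```

For example, 1,100 GB of egress in a month costs roughly 1,000 GB at $0.09/GB, i.e. about $90, and a $10/hr on-demand instance could drop to about $1/hr at the maximum Spot discount.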

Performance & Scaling

Multi-Node Training: Yes (1,000+ nodes with EFA and NCCL via AWS ParallelCluster and SageMaker)
Max Cluster Size: 20,000+ GPUs (via EC2 UltraClusters with p4d/p5 instances)
Elastic Scaling: Yes (add/remove nodes dynamically via Auto Scaling Groups and SageMaker managed clusters)
Auto Scaling: Yes (policy-based via EC2 Auto Scaling and SageMaker endpoint auto-scaling)
InfiniBand: No; Elastic Fabric Adapter (EFA) instead, with 3,200 Gbps aggregate bandwidth on p5.48xlarge and 400 Gbps per node on p4d
NVSwitch: Yes (NVSwitch for intra-node GPU communication on p4d and p5 SXM instances)
SLA: 99.99%
Perf Isolation: Yes (dedicated bare metal on p4d/p5 instances; no hypervisor overhead on metal instances)
Noisy Neighbor Mitigation: Yes (dedicated hardware; no sharing on p4d/p5 UltraCluster nodes)
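To sanity-check multi-node training plans against the 3,200 Gbps aggregate figure quoted above, a back-of-envelope ring all-reduce estimate is useful. This is a simplistic model that ignores latency, protocol overhead, and compute/communication overlap, so treat the result as a lower bound, not a benchmark.

```python
def allreduce_seconds(param_bytes: float, nodes: int,
                      node_bw_gbps: float = 3200.0) -> float:
    """Ring all-reduce lower-bound time across nodes.

    Each node sends and receives 2*(n-1)/n of the payload; bandwidth
    defaults to the 3,200 Gbps aggregate quoted for p5.48xlarge.
    """
    if nodes < 2:
        return 0.0  # nothing to reduce across a single node
    bw_bytes_per_s = node_bw_gbps * 1e9 / 8  # Gbps -> bytes/s
    return 2 * (nodes - 1) / nodes * param_bytes / bw_bytes_per_s
```

For instance, syncing a 400 GB gradient payload (4e11 bytes) across 2 nodes at 3,200 Gbps works out to about 1 second per step under this idealized model.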

Developer Experience

Onboarding: Self-service account creation in minutes; GPU instances launchable via AWS Console, CLI, or SDK within minutes; enterprise onboarding with Solutions Architects available
Frameworks: CUDA, cuDNN (PyTorch and TensorFlow preconfigured via Deep Learning AMIs and Containers)
SDK Languages: Python, Java, Go, Node.js, Ruby, PHP, C++, .NET, Rust
CLI Tooling: Full AWS CLI with comprehensive EC2 and SageMaker commands; AWS CDK for infrastructure as code; SSM Session Manager for SSH-less access
Jupyter: Native via Amazon SageMaker Studio (managed JupyterLab); also available via Amazon EMR notebooks and self-hosted on EC2
Templates: SageMaker JumpStart (foundation models), AWS Deep Learning AMIs, AWS Deep Learning Containers, ParallelCluster for HPC, SageMaker Training Jobs, Bedrock for managed LLM inference
Model Marketplace: AWS Marketplace for ML models; Amazon Bedrock for managed foundation model APIs (Claude, Llama, Titan, Stable Diffusion, etc.); SageMaker JumpStart model hub
Documentation: Comprehensive docs with extensive tutorials, API reference, whitepapers, workshops, and re:Invent session recordings
API Features: AWS CLI, SDKs, REST API

Security & Compliance

Security: Compliance with global security standards; regular security assessments
Compliance: Compliant with major certifications as part of AWS's broad compliance program

ISO 27001, SOC 1/2/3, PCI DSS, HIPAA, FedRAMP High certified
NVIDIA DGX-Ready partner
Largest cloud provider by market share (33%+)
Trusted by Fortune 500 and government agencies globally
NVIDIA Elite Cloud Service Provider
AWS GovCloud for US federal compliance

Data Center Locations

Coverage

Countries: United States, Canada, Brazil, Ireland, United Kingdom, Germany, France, Sweden, Spain, Switzerland, Italy, Japan, South Korea, Singapore, Australia, India, Hong Kong, Indonesia, Malaysia, Thailand, New Zealand, South Africa, Israel, United Arab Emirates, Bahrain, China
Cities: Ashburn VA, Columbus OH, Hillsboro OR, San Jose CA, Montreal, São Paulo, Dublin, London, Frankfurt, Paris, Stockholm, Milan, Zurich, Tokyo, Osaka, Seoul, Singapore, Sydney, Melbourne, Mumbai, Hyderabad, Hong Kong, Jakarta, Kuala Lumpur, Bangkok, Auckland, Cape Town, Tel Aviv, Dubai, Bahrain, Beijing, Ningxia
Multi-Region Failover: Yes (automatic and manual failover via Route 53, multi-AZ, and multi-region architectures)
Latency Tiers: Ultra-low (<1 ms intra-AZ); standard cloud latency inter-region; AWS Local Zones for <10 ms to metro areas
North America, South America, Europe, Asia-Pacific, Middle East, Africa

Compliance Regions

EU Data Residency: Yes (Dublin, Frankfurt, Paris, Stockholm, Milan, and Spain regions; the Zurich region additionally covers Swiss data residency)
US Gov Cloud: Yes (FedRAMP High authorized; AWS GovCloud US-East and US-West regions)
India Region: Yes (Mumbai and Hyderabad)

Key Strengths

Broadest GPU and AI accelerator portfolio including proprietary Trainium and Inferentia chips
Deepest ecosystem integration with S3, EKS, SageMaker, and 200+ AWS services
Global scale with 30+ regions enabling low-latency deployment worldwide
Managed ML platform (SageMaker) reduces infrastructure overhead for AI workloads
Spot Instances offer significant cost savings for fault-tolerant training workloads

Known Limitations

GPU pricing is generally higher than specialized GPU cloud providers
Complexity of AWS ecosystem can be overwhelming for smaller teams
H100 availability can be constrained; waitlists common for large allocations
Egress bandwidth costs can significantly increase total spend
Spot Instance interruptions require robust checkpointing strategies for long training runs
Proprietary Trainium/Inferentia chips require code adaptation and have limited third-party software support

Additional Information

Support Options

24/7 phone support, ticket submissions, technical support plans

Community

Large community via AWS re:Post forums, AWS Developer Forums, re:Invent conference, AWS User Groups globally, Stack Overflow (largest cloud tag), GitHub AWS org, and active social media presence

Green Energy

Reached 100% renewable energy in 2023, ahead of its 2025 target; member of The Climate Pledge, targeting net-zero carbon by 2040

PUE Rating

1.2 (global average as reported by AWS)

Core Proposition

Broadest global infrastructure footprint with the widest GPU instance variety, deep AWS service integration, and mature enterprise compliance and security certifications.

Notable Customers

Netflix
NASA
Airbnb
Samsung
BMW
Goldman Sachs
Pfizer
Epic Games

Payment Methods

Credit Card, ACH/Bank Transfer, Wire Transfer, AWS Marketplace, Purchase Order (enterprise)
Last updated March 2026. Information subject to change.