GPU Cloud Provider · Seattle, Washington, USA

Amazon Web Services (AWS)

Amazon EC2 G4 instances provide cost-effective and versatile GPU options for machine learning inference and graphics-intensive applications using NVIDIA and AMD GPUs. These instances are optimized for various applications, including machine learning, gaming, and virtual workstations, providing a blend of performance and cost efficiency.

GPUs
1
Founded
2006 (launched as part of Amazon.com)
Countries
26
Data Centers
39
Uptime SLA
99.99%
Team Size
10,000+

GPU Marketplace

Company Profile

Company Type: Hyperscaler
Provider Type: Hyperscaler
Founded: 2006 (launched as part of Amazon.com)
Headquarters: Seattle, Washington, USA
Legal Entity: Amazon Web Services, Inc.
Parent Company: Amazon.com, Inc.
Funding: Public (NASDAQ: AMZN)
Total Raised: Not disclosed (subsidiary of Amazon.com, Inc.)
Team Size: 10,000+

Infrastructure

GPU Fleet: NVIDIA H100 SXM (p5), NVIDIA A100 80GB SXM (p4de), NVIDIA A100 40GB (p4d), NVIDIA V100 (p3), NVIDIA T4 (g4dn), NVIDIA A10G (g5), NVIDIA L4 (g6), NVIDIA L40S (g6e), AWS Trainium (trn1), AWS Inferentia (inf2)
Network Fabric: Up to 100 Gbps Ethernet on general GPU instances; Elastic Fabric Adapter (EFA) for HPC-class networking on p4/p5
Connectivity: Up to 100 Gbps (higher aggregate with EFA on p4d/p5)
Storage: Local NVMe-based SSD storage
Data Center Tier: Proprietary AWS data centers meeting Tier 3+ equivalents; ISO 27001, SOC 1/2/3 certified
Bare Metal: Yes, via bare metal EC2 instances (e.g., p4de.24xlarge, p5.48xlarge)
Availability: Generally Available
Enterprise · Startup · Research · Government · ISV
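The fleet table above maps neatly onto instance-family prefixes, which is handy for tagging instances or attributing costs by accelerator. A minimal Python sketch (family-to-GPU pairs taken from the table above; the helper name `gpu_for` is illustrative, not an AWS API):

```python
# Instance family -> accelerator, per the GPU Fleet table above.
GPU_BY_FAMILY = {
    "p5": "NVIDIA H100 SXM",
    "p4de": "NVIDIA A100 80GB SXM",
    "p4d": "NVIDIA A100 40GB",
    "p3": "NVIDIA V100",
    "g4dn": "NVIDIA T4",
    "g5": "NVIDIA A10G",
    "g6": "NVIDIA L4",
    "g6e": "NVIDIA L40S",
    "trn1": "AWS Trainium",
    "inf2": "AWS Inferentia",
}

def gpu_for(instance_type: str) -> str:
    """Return the accelerator for an EC2 instance type like 'p5.48xlarge'."""
    # EC2 type names are '<family>.<size>', so the family is the first segment.
    family = instance_type.split(".", 1)[0]
    return GPU_BY_FAMILY[family]
```

Families not in the table (e.g., retired generations) raise `KeyError`, which is usually the right failure mode for cost-attribution scripts.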

Compute & Deployment

On-Demand: Yes
Spot / Interruptible: Yes (up to 90% savings via EC2 Spot Instances; AWS sets Spot prices based on supply and demand, auction bidding having been retired in 2017)
Reserved Instances: Yes (1-year and 3-year terms, Standard and Convertible options)
Bare Metal: Yes (EC2 Bare Metal instances available for select GPU families)
VM-Based: Yes (EC2 GPU instances: P4, P5, G4, G5, G6 families)
Container-Based: Yes (Docker via ECS and EKS)
Kubernetes: Yes (managed Kubernetes via Amazon EKS)
Serverless GPU: No (SageMaker offers managed inference endpoints, but not a purely serverless GPU product)
Spin-Up Time: 1–5 minutes (on-demand EC2); longer for capacity-constrained instance types like P5
Terraform: Yes (official provider on the HashiCorp registry: hashicorp/aws)
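Since Spot capacity and full SDK access are both supported, Spot GPU launches can be scripted as well as declared in Terraform. A hedged sketch that builds the request parameters for boto3's `ec2.run_instances` as a plain dict, so they can be inspected or tested without touching the AWS API (the AMI ID in the usage note is a placeholder, and `g4dn.xlarge` is just an example default):

```python
def spot_gpu_request(ami_id: str, instance_type: str = "g4dn.xlarge") -> dict:
    """Build kwargs for boto3 ec2.run_instances requesting one Spot GPU instance."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # Ask for Spot capacity; terminate (rather than stop) on interruption.
        "InstanceMarketOptions": {
            "MarketType": "spot",
            "SpotOptions": {
                "SpotInstanceType": "one-time",
                "InstanceInterruptionBehavior": "terminate",
            },
        },
    }

# Usage (requires AWS credentials; the AMI ID is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   ec2.run_instances(**spot_gpu_request("ami-0123456789abcdef0"))
```

Keeping the request as a dict also makes it easy to diff against the equivalent Terraform `aws_instance` resource during a migration.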

GPU Hardware

Latest Gen: H100 SXM, H200 SXM, L40S, Trainium2, Inferentia2
Legacy Support: A100, V100, T4, A10G, K80
Multi-GPU Nodes: Yes (up to 8 GPUs per node)
Max GPUs/Node: 8
NVLink: Yes (NVLink 3.0 on p4 SXM nodes, NVLink 4.0 on p5 H100 SXM nodes)
InfiniBand: No native InfiniBand; EFA over Ethernet fills the role (3,200 Gbps aggregate on p5 instances, 400 Gbps on p4d)
PCIe vs SXM: Both PCIe and SXM form factors offered
HGX Platform: Yes (HGX H100 8-GPU on p5 instances)

Pricing Model

Per Hour: Yes (primary billing unit)
Per Minute: Billing is actually per second, with a 60-second minimum
Subscription: Yes (Savings Plans with 1-year and 3-year commitments)
Reserved Discount: Up to 72% off with 3-year Reserved Instances; up to roughly 40% with a 1-year commitment
Spot Discount: Up to 90% off on-demand with EC2 Spot Instances
Public Pricing: Yes
Hidden Fees: Public IPv4 addresses ($0.005/hr), inter-AZ data transfer ($0.01/GB each way), EBS storage billed separately, CloudWatch monitoring fees
Egress Charges: First 100 GB/month free, then tiered: $0.09/GB (up to 10 TB), $0.085/GB (10–50 TB), $0.07/GB (50–150 TB), lower beyond that; free within the same region
Pay-as-you-go: Yes
Credit System: Yes (AWS credits via promotional programs, AWS Activate for startups, partner credits)
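The tiered egress schedule above can be turned into a quick cost estimator. A minimal sketch using only the rates listed here (it assumes 1 TB = 1,000 GB, that the free 100 GB counts toward the first tier's 10 TB boundary, and it does not model the unlisted rates beyond 150 TB; verify against the current AWS price list before relying on it):

```python
# Marginal internet-egress tiers per the schedule above: (upper bound in GB, $/GB).
TIERS = [
    (100, 0.00),      # first 100 GB/month free
    (10_000, 0.09),   # up to 10 TB
    (50_000, 0.085),  # 10-50 TB
    (150_000, 0.07),  # 50-150 TB; rates beyond this are not listed above
]

def egress_cost(gb: float) -> float:
    """Estimate the monthly internet egress bill in USD for `gb` transferred."""
    cost, prev = 0.0, 0.0
    for bound, rate in TIERS:
        if gb <= prev:
            break
        # Charge the marginal rate on the portion falling inside this tier.
        cost += (min(gb, bound) - prev) * rate
        prev = bound
    return round(cost, 2)

# e.g. 1.1 TB out in a month: 100 GB free, then 1,000 GB at $0.09/GB = $90.00
```

This kind of marginal-tier loop is also the shape to use for the inter-AZ and IPv4 line items if you fold them into the same estimator.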

Performance & Scaling

Multi-Node Training: Yes (1,000+ nodes with EFA and NCCL via SageMaker or AWS ParallelCluster)
Max Cluster Size: 20,000+ GPUs (AWS UltraClusters with p4d/p5 instances)
Elastic Scaling: Yes (add/remove nodes dynamically via Auto Scaling Groups or SageMaker managed clusters)
Auto Scaling: Yes (policy-based auto-scaling via EC2 Auto Scaling and SageMaker endpoint auto-scaling)
InfiniBand: No native InfiniBand; EFA delivers 3,200 Gbps aggregate on p5.48xlarge UltraClusters and 400 Gbps on p4d instances
NVSwitch: Yes (NVSwitch for intra-node GPU communication on p4d and p5 SXM instances)
SLA: 99.99%
Perf Isolation: Yes (dedicated bare metal on p4d, p5, and Inf instances; metal instance types involve no hypervisor sharing)
Noisy Neighbor: Mitigated (bare metal p4d.24xlarge and p5.48xlarge share no hardware; dedicated tenancy options available)

Developer Experience

Onboarding: Self-service via AWS Console or CLI; deploy in minutes with an existing AWS account; enterprise onboarding with a dedicated SA available
Frameworks: Support for major machine learning frameworks compatible with NVIDIA and AMD GPUs
SDK Languages: Python, Java, JavaScript, TypeScript, Go, Ruby, PHP, .NET, C++, Rust
CLI Tooling: Full AWS CLI with extensive EC2, SageMaker, and EKS support; CloudFormation and CDK for infrastructure as code
Jupyter: Native via Amazon SageMaker Studio (managed JupyterLab); also supported on EC2 via self-managed setup
Templates: LLM Training on SageMaker, Stable Diffusion on EC2, PyTorch Training, TensorFlow Training, Hugging Face on SageMaker, Ray on AWS, Deep Learning AMIs
Model Marketplace: AWS Marketplace with AI/ML models; Amazon Bedrock for managed foundation models; SageMaker JumpStart model hub
Documentation: Comprehensive docs with tutorials, API reference, whitepapers, and AWS re:Post community Q&A
API Features: AWS CLI, SDKs for popular programming languages, REST API, AWS CloudFormation

Security & Compliance

Security: Regular security assessments under AWS's comprehensive, shared-responsibility security model
Compliance: ISO 27001, SOC 1/2/3, PCI DSS, HIPAA eligible, FedRAMP High authorized
Largest cloud provider by market share (~32%) · Used by the majority of Fortune 500 companies · AWS re:Invent annual conference with 50,000+ attendees · NVIDIA DGX-Ready Cloud Partner

Data Center Locations

Coverage

Countries: United States, Germany, Ireland, United Kingdom, France, Sweden, Spain, Italy, Japan, South Korea, Singapore, Australia, India, Canada, Brazil, South Africa, United Arab Emirates, Israel, Bahrain, Malaysia, Indonesia, Thailand, New Zealand, Hong Kong, Taiwan, China
Cities: Ashburn VA, Columbus OH, San Jose CA, Seattle WA, Portland OR, Miami FL, Dallas TX, Chicago IL, New York NY, Frankfurt, Dublin, London, Paris, Stockholm, Madrid, Milan, Tokyo, Osaka, Seoul, Singapore, Sydney, Melbourne, Mumbai, Pune, Toronto, Montreal, Sao Paulo, Cape Town, Dubai, Tel Aviv, Bahrain, Kuala Lumpur, Jakarta, Bangkok, Auckland, Hong Kong, Taipei, Beijing, Ningxia
Multi-Region Failover: Yes (automatic and manual failover via Route 53, Global Accelerator, and multi-AZ/multi-region architectures)
Latency Tiers: Ultra-low (<1 ms intra-AZ), low (1–10 ms inter-AZ), standard cloud latency inter-region; CloudFront edge under 10 ms for many endpoints
North America · Europe · Asia-Pacific · South America · Middle East · Africa

Compliance Regions

EU Data Residency: Yes (Frankfurt, Dublin, London, Paris, Stockholm, Madrid, Milan; GDPR compliant with the AWS Data Processing Addendum)
US Gov Cloud: Yes (FedRAMP High authorized; AWS GovCloud US-East and US-West regions, DoD IL2/IL4/IL5 compliant)
India Region: Yes (Mumbai ap-south-1, Hyderabad ap-south-2)

Key Strengths

Broadest GPU instance portfolio including proprietary Trainium and Inferentia chips
Deep integration with managed ML platform SageMaker for end-to-end MLOps
Unmatched global infrastructure with 30+ regions and 90+ availability zones
Extensive partner ecosystem and AWS Marketplace for AI/ML tools
EFA (Elastic Fabric Adapter) for ultra-low latency GPU cluster networking

Known Limitations

Complex pricing model with many instance types and add-on costs
H100 and latest GPU instances frequently face availability constraints in popular regions
Proprietary Trainium/Inferentia chips require custom SDK (Neuron) with limited framework support
SageMaker abstraction can reduce flexibility for advanced ML engineers
Egress costs can be significant for large-scale data transfer out of AWS

Additional Information

Support Options

24/7 support through AWS Support plans (Basic, Developer, Business, Enterprise)

Community

AWS re:Post community forums, GitHub (aws org with 500+ repos), active Stack Overflow presence, AWS User Groups globally, Discord and Slack communities for specific services

Green Energy

Committed to 100% renewable energy by 2025 (achieved in 2023); net-zero carbon by 2040 under The Climate Pledge

PUE Rating

1.2 (AWS-reported global average)

Core Proposition

Broadest GPU instance portfolio with deepest integration across managed ML services, networking, storage, and global infrastructure at hyperscale.

Notable Customers

Netflix
Airbnb
Samsung
BMW
Goldman Sachs
Pfizer
NASA
Snap

Payment Methods

Credit Card · Wire Transfer · AWS Marketplace · Invoice (enterprise)
Last updated March 2026. Information subject to change.