GPU Cloud Provider · Seattle, WA, USA

AWS

AWS offers G4 instances optimized for machine learning inference and graphics-intensive workloads. They provide cost-effective NVIDIA (g4dn) or AMD (g4ad) GPU options, suited to tasks like AI model deployment, game streaming, and high-end graphics rendering.

GPUs: 1
Founded: 2006
Countries: 24
Data Centers: 31
Uptime SLA: 99.99%
Team Size: 10,000+

GPU Marketplace

$42.00/hour

Company Profile

Company Type: Hyperscaler
Provider Type: Hyperscaler
Founded: 2006
Headquarters: Seattle, WA, USA
Legal Entity: Amazon Web Services, Inc.
Parent Company: Amazon.com, Inc.
Funding: Public (NASDAQ: AMZN)
Total Raised: Not applicable (subsidiary of Amazon)
Team Size: 10,000+

Infrastructure

GPU Fleet: NVIDIA H100 SXM (p5.48xlarge), NVIDIA H200 (p5e instances), NVIDIA A100 80GB SXM (p4de.24xlarge), NVIDIA A100 40GB (p4d.24xlarge), NVIDIA V100 (p3 instances), NVIDIA T4 (g4dn instances), NVIDIA L4 (g6 instances), NVIDIA L40S (g6e instances), AWS Trainium (trn1 instances), AWS Inferentia2 (inf2 instances)
Network Fabric: 100 Gbps Ethernet, Elastic Fabric Adapter
Connectivity: Up to 100 Gbps
Storage: Local NVMe-based SSD, Amazon EBS (Elastic Block Store), Amazon S3 (Simple Storage Service)
Data Center Tier: Tier 3+ equivalent; proprietary AWS data centers with custom power and cooling infrastructure
Bare Metal: Yes, via EC2 bare metal instances (e.g., p4de.24xlarge, p5.48xlarge)
Availability: GA (Generally Available)
Enterprise · Startup · Research · Government · Education

Compute & Deployment

On-Demand: Yes
Spot / Interruptible: Yes (up to 90% savings via EC2 Spot Instances, market-based pricing)
Reserved Instances: Yes (1-year and 3-year terms, Standard and Convertible options; Savings Plans also available)
Bare Metal: Yes (EC2 Bare Metal instances available for select GPU instance types)
VM-Based: Yes (EC2 GPU instances: P3, P4, P5, G4, G5, G6, Trn1, Inf2)
Container-Based: Yes (Docker via ECS, ECR, and EKS; Fargate for serverless containers)
Kubernetes: Yes (managed K8s via Amazon EKS, with GPU node group support and Neuron device plugin)
Serverless GPU: Yes (AWS Inferentia via SageMaker Serverless Inference; limited GPU serverless options)
Spin-Up Time: 2-5 minutes for standard GPU instances; 10+ minutes for large P5 instances or during capacity constraints
Terraform: Yes (official provider hashicorp/aws on the HashiCorp Registry, maintained by AWS and HashiCorp)

GPU Hardware

Latest Gen: H100 SXM, H100 PCIe, L40S, Trainium2
Legacy Support: A100, V100, T4, A10G, K80
Multi-GPU Nodes: Yes (up to 8x per node)
Max GPUs/Node: 8
NVLink: Yes (NVLink 4.0 on SXM nodes)
InfiniBand: No native InfiniBand; EFA (an Ethernet-based fabric) delivers 3,200 Gbps aggregate on p5 instances
PCIe vs SXM: Both PCIe and SXM
HGX Platform: Yes (HGX H100 8-GPU on p5 instances)
Liquid Cooling: Select SKUs (p5 instances with H100 SXM)

Pricing Model

Per Hour: Yes (primary billing unit)
Per Minute: Per-second billing (60-second minimum)
Subscription: Yes (Reserved Instances: 1-year and 3-year terms)
Reserved Discount: Up to 72% off with a 3-year all-upfront Reserved Instance commitment
Spot Discount: Up to 90% off on-demand with Spot Instances
Public Pricing: Yes
Hidden Fees: Public IPv4 addresses ($0.005/hr), EBS storage billed separately from the instance price, inter-AZ data transfer ($0.01/GB each way)
Egress Charges: $0.09/GB for the first 10 TB/month, tiered down to $0.05/GB above 150 TB; free within the same region
Pay-as-you-go: Yes
Credit System: Yes (AWS Credits via promotional programs and enterprise agreements)
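The discount and tier figures above reduce to simple arithmetic. A minimal Python sketch, assuming the $42.00/hour marketplace rate quoted on this page; the two intermediate egress rates ($0.085 and $0.07/GB) are assumptions drawn from AWS's published tier schedule, not from this page:

```python
# Illustrative cost math using the figures quoted on this page.
ON_DEMAND_HOURLY = 42.00       # marketplace rate shown above
SPOT_MAX_DISCOUNT = 0.90       # "up to 90% off on-demand"
RESERVED_3YR_DISCOUNT = 0.72   # "up to 72% off" (3-yr all-upfront)

# Egress tiers (tier size in GB, USD/GB). First and last rates come from
# the page; the middle two are assumptions based on AWS's tiered schedule.
EGRESS_TIERS = [
    (10 * 1024, 0.09),     # first 10 TB/month
    (40 * 1024, 0.085),    # next 40 TB (assumption)
    (100 * 1024, 0.07),    # next 100 TB (assumption)
    (float("inf"), 0.05),  # above 150 TB
]

def billed_seconds(run_seconds: int) -> int:
    """Per-second billing with a 60-second minimum."""
    return max(60, run_seconds)

def egress_cost(gb: float) -> float:
    """Walk the tiers, charging each slice at its own rate."""
    cost, remaining = 0.0, gb
    for size, rate in EGRESS_TIERS:
        slice_gb = min(remaining, size)
        cost += slice_gb * rate
        remaining -= slice_gb
        if remaining <= 0:
            break
    return cost

if __name__ == "__main__":
    print(f"spot floor:   ${ON_DEMAND_HOURLY * (1 - SPOT_MAX_DISCOUNT):.2f}/hr")
    print(f"3yr reserved: ${ON_DEMAND_HOURLY * (1 - RESERVED_3YR_DISCOUNT):.2f}/hr")
    print(f"45 s run bills as {billed_seconds(45)} s")
    print(f"20 TB egress: ${egress_cost(20 * 1024):,.2f}")
```

At these rates a 20 TB month of egress splits into a 10 TB slice at $0.09 and a 10 TB slice at the assumed $0.085, which is why cost estimation needs the full tier table rather than a single per-GB figure.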

Performance & Scaling

Multi-Node Training: Yes (1,000+ nodes with NCCL and AWS ParallelCluster/EFA)
Max Cluster Size: 20,000+ GPUs (via UltraClusters with P4d/P5 instances)
Elastic Scaling: Yes (add/remove nodes dynamically via Auto Scaling Groups and SageMaker)
Auto Scaling: Yes (policy-based scaling via EC2 Auto Scaling and SageMaker endpoint scaling)
InfiniBand: No native InfiniBand; EFA provides 3,200 Gbps aggregate per node on P5 UltraClusters and 400 Gbps per node on P4d instances
NVSwitch: Yes (P4d and P5 SXM instances use NVSwitch for intra-node GPU communication)
SLA: 99.99%
Perf Isolation: Yes (P4d and P5 are bare metal by default; other EC2 bare metal instances also available)
Noisy Neighbor Mitigation: Yes (bare metal P4d/P5 instances with no hypervisor sharing; dedicated tenancy available on all instance types)
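The node and fabric figures above imply straightforward capacity arithmetic. A back-of-the-envelope sketch, assuming the 8-GPU nodes and 3,200 Gbps per-node EFA bandwidth quoted in this section:

```python
import math

GPUS_PER_NODE = 8          # max GPUs/node quoted above
EFA_GBPS_PER_NODE = 3200   # P5 per-node aggregate EFA bandwidth (per page)

def nodes_needed(total_gpus: int) -> int:
    """Smallest number of 8-GPU nodes covering a target GPU count."""
    return math.ceil(total_gpus / GPUS_PER_NODE)

def aggregate_fabric_gbps(total_gpus: int) -> int:
    """Upper-bound aggregate NIC bandwidth across the cluster (Gbps)."""
    return nodes_needed(total_gpus) * EFA_GBPS_PER_NODE
```

By this arithmetic, a 20,000-GPU UltraCluster is 2,500 nodes, which is why the job scheduler and NCCL topology, not single-node performance, dominate at that scale.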

Developer Experience

Onboarding: Self-service via AWS Console, CLI, or SDK; instances launch in minutes; enterprise onboarding via AWS account teams and solution architects
Frameworks: TensorFlow, PyTorch, Apache MXNet, CUDA, cuDNN
SDK Languages: Python, Java, Go, Node.js, Ruby, PHP, C++, .NET, Rust
CLI Tooling: Full AWS CLI with comprehensive EC2/SageMaker management; AWS CDK for infrastructure as code; Session Manager for SSH-less access
Jupyter: Via Amazon SageMaker Studio (native JupyterLab), SageMaker Notebooks, or self-hosted Jupyter on EC2
Templates: SageMaker JumpStart ML models, Deep Learning AMIs (DLAMI) with PyTorch/TensorFlow, LLM fine-tuning via SageMaker, Stable Diffusion on EC2, distributed training with SageMaker
Model Marketplace: AWS Marketplace with thousands of ML models; SageMaker JumpStart built-in model library including foundation models; Amazon Bedrock for managed LLM APIs
Documentation: Comprehensive docs with tutorials, API references, whitepapers, and AWS Skill Builder training courses
API Features: AWS CLI, SDKs for popular languages, RESTful API, AWS CloudFormation
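A self-service launch through the Python SDK (boto3) reduces to building a RunInstances request. A minimal, hypothetical sketch: the parameter builder below is pure Python, `ami-EXAMPLE` is a placeholder, and the commented-out boto3 call assumes configured AWS credentials and a real AMI ID:

```python
# Sketch of a programmatic GPU-instance launch request. The builder is
# plain Python; the commented-out call shows how it would be used.

def gpu_launch_params(instance_type: str = "g6.xlarge",
                      ami_id: str = "ami-EXAMPLE",  # placeholder, not a real AMI
                      count: int = 1,
                      spot: bool = False) -> dict:
    """Build keyword arguments for the EC2 RunInstances API."""
    params = {
        "ImageId": ami_id,            # e.g. a Deep Learning AMI (DLAMI)
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
    }
    if spot:
        # Request interruptible capacity at the spot market price.
        params["InstanceMarketOptions"] = {"MarketType": "spot"}
    return params

# import boto3
# ec2 = boto3.client("ec2", region_name="us-east-1")
# ec2.run_instances(**gpu_launch_params("p5.48xlarge", spot=True))
```

Keeping request construction separate from the client call makes the launch configuration easy to unit-test and to reuse across on-demand and spot paths.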

Security & Compliance

Security: Comprehensive vulnerability assessments and regular security audits
Compliance: ISO 27018, ISO 27001, ISO 22301, PCI DSS, SOC 1, SOC 2, SOC 3
ISO 27001, SOC 1/2/3, PCI DSS, HIPAA, and FedRAMP High certified
Used by over 1 million active customers globally
NVIDIA Select Cloud Service Provider
Named a Leader in the Gartner Magic Quadrant for Cloud Infrastructure for 14 consecutive years
AWS GovCloud for US government workloads

Data Center Locations

Coverage

Countries: United States, Canada, Brazil, Ireland, United Kingdom, Germany, France, Italy, Spain, Sweden, Switzerland, Israel, United Arab Emirates, South Africa, India, Singapore, Japan, South Korea, Australia, New Zealand, Hong Kong, Indonesia, Malaysia, Thailand
Cities: Ashburn VA, Columbus OH, Dallas TX, Portland OR, San Jose CA, Montreal, Sao Paulo, Dublin, London, Frankfurt, Paris, Milan, Madrid, Stockholm, Zurich, Tel Aviv, Dubai, Cape Town, Mumbai, Hyderabad, Singapore, Tokyo, Osaka, Seoul, Sydney, Melbourne, Auckland, Hong Kong, Jakarta, Kuala Lumpur, Bangkok
Multi-Region Failover: Yes (automatic and manual failover via Route 53, Global Accelerator, and multi-AZ/multi-region architectures)
Latency Tiers: Ultra-low (<1 ms intra-AZ), standard cloud latency inter-region, CloudFront edge for <10 ms globally
North America · Europe · Asia-Pacific · South America · Middle East · Africa

Compliance Regions

EU Data Residency: Yes (Frankfurt, Dublin, Paris, Milan, Madrid, Stockholm, Zurich)
US Gov Cloud: Yes (FedRAMP authorized, AWS GovCloud US-East and US-West regions, DoD IL2/IL4/IL5/IL6)
India Region: Yes (Mumbai, Hyderabad)

Key Strengths

Largest global infrastructure footprint with 30+ regions ensuring low-latency access worldwide
Purpose-built AI silicon (AWS Trainium, Inferentia2) offering cost-effective alternatives to NVIDIA for training and inference
Deep integration across entire cloud stack (storage, networking, databases, ML services) enabling end-to-end ML pipelines
SageMaker managed ML platform reduces infrastructure management overhead
Broad enterprise compliance portfolio (100+ certifications) enabling regulated industries to adopt GPU workloads

Known Limitations

H100 and high-end GPU availability can be constrained; long waitlists or limited spot availability in some regions
Pricing is complex with hundreds of instance types and pricing dimensions; cost optimization requires expertise
Vendor lock-in risk due to proprietary services like SageMaker, Trainium, and Inferentia
Egress data transfer costs can be significant for large-scale ML workloads
On-demand GPU pricing is among the higher-cost options compared to specialized GPU cloud providers

Additional Information

Support Options

["24/7 technical support","Documentation","Community forums","Premium Support tiers (additional cost)"]

Community

AWS re:Post community forums, AWS Discord server, AWS Heroes program, GitHub (aws org with 500+ repos), AWS re:Invent annual conference, active Stack Overflow presence

Green Energy

Committed to 100% renewable energy by 2025 (largely achieved); net-zero carbon by 2040 under The Climate Pledge; 100+ renewable energy projects globally

PUE Rating

1.2 (global average; AWS reports 1.15-1.25 range depending on region)
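PUE divides total facility power by IT-equipment power, so a 1.2 average implies roughly 20% overhead for cooling and power distribution. A one-line sketch using the figure above:

```python
# PUE = facility_kw / it_kw, so facility draw = IT load * PUE.

def facility_power_kw(it_kw: float, pue: float = 1.2) -> float:
    """Total facility draw implied by an IT load at a given PUE."""
    return it_kw * pue

def overhead_kw(it_kw: float, pue: float = 1.2) -> float:
    """Cooling/power-distribution overhead beyond the IT load itself."""
    return it_kw * (pue - 1.0)

# A 1,000 kW IT load at PUE 1.2 draws about 1,200 kW total,
# roughly 200 kW of which is non-IT overhead.
```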

Core Proposition

Broadest GPU instance portfolio (P4, P5, G5, Trn1, Inf2) integrated with the world's largest cloud ecosystem including SageMaker, EKS, and 200+ managed services.

Notable Customers

Netflix
Airbnb
NASA
Samsung
Pfizer
Goldman Sachs
Epic Games
Snap

Payment Methods

Credit Card · Wire Transfer · AWS Marketplace · Invoice (enterprise) · AWS Credits
Last updated March 2026. Information subject to change.