GPU Cloud Provider · Seattle, Washington, USA

AWS (Amazon)

AWS G4 instances provide cost-effective, high-performance GPU capacity for a variety of applications, including machine learning inference, small-scale training, and graphics-intensive workloads. G4dn instances use NVIDIA T4 GPUs, while G4ad instances use AMD Radeon Pro V520 GPUs, covering different performance and cost points.

Founded: 2006 (AWS launch)
Countries: 25
Data Centers: 32
Uptime SLA: 99.99%
Team Size: 10,000+


Company Profile

Company Type: Hyperscaler
Provider Type: Hyperscaler
Founded: 2006 (AWS launch)
Headquarters: Seattle, Washington, USA
Legal Entity: Amazon Web Services, Inc.
Parent Company: Amazon.com, Inc.
Funding: Public (NASDAQ: AMZN)
Total Raised: Not applicable (subsidiary of Amazon)
Team Size: 10,000+

Infrastructure

GPU Fleet: NVIDIA H100 SXM (p5.48xlarge), NVIDIA A100 80GB SXM (p4de.24xlarge), NVIDIA A100 40GB (p4d.24xlarge), NVIDIA V100 (p3 instances), NVIDIA T4 (g4dn instances), NVIDIA A10G (g5 instances), NVIDIA L4 (g6 instances), NVIDIA L40S (g6e instances), AWS Trainium (trn1 instances), AWS Inferentia (inf2 instances)
Total GPU Capacity: Not disclosed (one of the largest GPU fleets globally)
Network Fabric: High-speed networking up to 100 Gbps; Elastic Fabric Adapter (EFA)
Connectivity: Up to 100 Gbps
Storage: Local NVMe-based SSD storage
Data Center Tier: Tier 3+ equivalent; AWS-proprietary design with redundant power and cooling across all regions
Bare Metal: Yes, via Dedicated Hosts and bare metal instance types (e.g., p4d.24xlarge bare metal)
Availability: GA (Generally Available)
Enterprise, Startup, Research, Government, Independent Software Vendors
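For programmatic work against the fleet list above, a simple lookup from instance family to accelerator is often handy. This is a minimal sketch built only from the mappings stated in this profile; treat it as a snapshot, not an authoritative AWS catalog.

```python
# Instance family -> accelerator, as listed in this profile's GPU Fleet entry.
GPU_BY_FAMILY = {
    "p5": "NVIDIA H100 SXM",
    "p4de": "NVIDIA A100 80GB SXM",
    "p4d": "NVIDIA A100 40GB",
    "p3": "NVIDIA V100",
    "g4dn": "NVIDIA T4",
    "g5": "NVIDIA A10G",
    "g6": "NVIDIA L4",
    "g6e": "NVIDIA L40S",
    "trn1": "AWS Trainium",
    "inf2": "AWS Inferentia",
}

def gpu_for_instance(instance_type: str) -> str:
    """Map an EC2 instance type like 'p5.48xlarge' to its accelerator.

    The family is everything before the first dot, so 'g6e.xlarge'
    resolves to 'g6e' (not 'g6'). Raises KeyError for unknown families.
    """
    family = instance_type.split(".", 1)[0]
    return GPU_BY_FAMILY[family]
```

A lookup like `gpu_for_instance("g4dn.xlarge")` returns `"NVIDIA T4"`, matching the fleet entry above.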

Compute & Deployment

On-Demand: Yes
Spot / Interruptible: Yes (up to 90% savings via EC2 Spot Instances; prices are set by supply and demand, as AWS retired auction-style bidding in 2017)
Reserved Instances: Yes (1-year and 3-year terms, Standard and Convertible options)
Bare Metal: Yes (EC2 bare metal instances available for select GPU instance types)
VM-Based: Yes (EC2 GPU instances: p3, p4, p5, g4, g5, g6 families)
Container-Based: Yes (Docker via ECS, EKS, and AWS Batch)
Kubernetes: Yes (managed Kubernetes via Amazon EKS)
Serverless GPU: No (SageMaker inference endpoints abstract the infrastructure but are not serverless in the traditional sense)
Spin-Up Time: 2-5 minutes for on-demand; Spot varies with capacity availability
Terraform: Yes (official HashiCorp-registry provider: hashicorp/aws)

GPU Hardware

Latest Gen: H100 SXM, L40S, GH200
Legacy Support: A100 SXM, V100, T4, A10G
Multi-GPU Nodes: Yes (up to 8x per node)
Max GPUs/Node: 8
Pool Size: 100,000+ GPUs
NVLink: Yes (NVLink 4.0 on H100 SXM p5 nodes)
InfiniBand: No native InfiniBand; Elastic Fabric Adapter (EFA) fills the same role, with 3,200 Gbps aggregate on p5 and 400 Gbps on p4d/p4de
PCIe vs SXM: Both PCIe and SXM
HGX Platform: Yes (HGX H100 8-GPU on p5 instances)
Liquid Cooling: Air-cooled only

Pricing Model

Per Hour: Yes (primary billing unit)
Per Minute: No; billing is per-second with a 60-second minimum
Subscription: Yes (Reserved Instances: 1-year and 3-year terms)
Reserved Discount: Up to 72% off with a 3-year all-upfront Reserved Instance commitment
Spot Discount: Up to 90% off on-demand with Spot Instances
Public Pricing: Yes
Hidden Fees: Public IPv4 address charges ($0.005/hr), inter-AZ data transfer ($0.01/GB each way), EBS volumes billed separately from instances
Egress Charges: Tiered pricing: first 100 GB/month free, then $0.09/GB up to 10 TB, with decreasing tiers beyond
Pay-as-you-go: Yes
Credit System: Yes (AWS Credits via promotional programs, startup credits, and education grants)
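The headline numbers above are enough for rough budgeting. This sketch applies only the figures stated in this profile: the "decreasing tiers beyond" 10 TB are not listed here, so the egress estimate conservatively keeps $0.09/GB past that point, and the discount percentages are best-case ceilings, not guaranteed rates.

```python
def monthly_egress_cost(gb: float) -> float:
    """Estimate monthly internet egress (USD): first 100 GB free,
    then $0.09/GB. Over-estimates past 10 TB, where real tiers drop."""
    free_gb, rate = 100.0, 0.09
    return max(gb - free_gb, 0.0) * rate

def effective_hourly(on_demand: float, plan: str) -> float:
    """Best-case hourly rate under the headline discounts above:
    72% off for a 3-year all-upfront RI, 90% off for Spot."""
    discount = {"on_demand": 0.0, "reserved_3yr": 0.72, "spot": 0.90}[plan]
    return on_demand * (1.0 - discount)
```

For example, 1,100 GB of egress in a month costs roughly 1,000 GB at $0.09/GB, i.e. about $90, and a $10/hr on-demand instance could drop to about $1/hr at the maximum Spot discount.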

Performance & Scaling

Multi-Node Training: Yes (1,000+ nodes with EFA and NCCL via AWS ParallelCluster and SageMaker)
Max Cluster Size: 20,000+ GPUs (via EC2 UltraClusters with p4d/p5 instances)
Elastic Scaling: Yes (add/remove nodes dynamically via Auto Scaling Groups and SageMaker managed clusters)
Auto Scaling: Yes (policy-based via EC2 Auto Scaling and SageMaker endpoint auto-scaling)
InfiniBand: No; Elastic Fabric Adapter (EFA) instead, with 3,200 Gbps aggregate bandwidth on p5.48xlarge and 400 Gbps per node on p4d
NVSwitch: Yes (NVSwitch for intra-node GPU communication on p4d and p5 SXM instances)
SLA: 99.99%
Perf Isolation: Yes (dedicated bare metal on p4d/p5 instances; no hypervisor overhead on metal instances)
Noisy Neighbor Mitigation: Yes (dedicated hardware; no sharing on p4d/p5 UltraCluster nodes)
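To sanity-check multi-node training plans against the 3,200 Gbps aggregate figure quoted above, a back-of-envelope ring all-reduce estimate is useful. This is a simplistic model that ignores latency, protocol overhead, and compute/communication overlap, so treat the result as a lower bound, not a benchmark.

```python
def allreduce_seconds(param_bytes: float, nodes: int,
                      node_bw_gbps: float = 3200.0) -> float:
    """Ring all-reduce lower-bound time across nodes.

    Each node sends and receives 2*(n-1)/n of the payload; bandwidth
    defaults to the 3,200 Gbps aggregate quoted for p5.48xlarge.
    """
    if nodes < 2:
        return 0.0  # nothing to reduce across a single node
    bw_bytes_per_s = node_bw_gbps * 1e9 / 8  # Gbps -> bytes/s
    return 2 * (nodes - 1) / nodes * param_bytes / bw_bytes_per_s
```

For instance, syncing a 400 GB gradient payload (4e11 bytes) across 2 nodes at 3,200 Gbps works out to about 1 second per step under this idealized model.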

Developer Experience

Onboarding: Self-service account creation in minutes; GPU instances launchable via AWS Console, CLI, or SDK within minutes; enterprise onboarding with Solutions Architects available
Frameworks: CUDA, cuDNN (PyTorch and TensorFlow preconfigured via Deep Learning AMIs and Containers)
SDK Languages: Python, Java, Go, Node.js, Ruby, PHP, C++, .NET, Rust
CLI Tooling: Full AWS CLI with comprehensive EC2 and SageMaker commands; AWS CDK for infrastructure as code; SSM Session Manager for SSH-less access
Jupyter: Native via Amazon SageMaker Studio (managed JupyterLab); also available via Amazon EMR notebooks and self-hosted on EC2
Templates: SageMaker JumpStart (foundation models), AWS Deep Learning AMIs, AWS Deep Learning Containers, ParallelCluster for HPC, SageMaker Training Jobs, Bedrock for managed LLM inference
Model Marketplace: AWS Marketplace for ML models; Amazon Bedrock for managed foundation model APIs (Claude, Llama, Titan, Stable Diffusion, etc.); SageMaker JumpStart model hub
Documentation: Comprehensive docs with extensive tutorials, API reference, whitepapers, workshops, and re:Invent session recordings
API Features: AWS CLI, SDKs, REST API

Security & Compliance

Security: Compliance with global security standards; regular security assessments
Compliance: Compliant with major certifications as part of AWS's broad compliance program

ISO 27001, SOC 1/2/3, PCI DSS, HIPAA, FedRAMP High certified
NVIDIA DGX-Ready partner
Largest cloud provider by market share (33%+)
Trusted by Fortune 500 and government agencies globally
NVIDIA Elite Cloud Service Provider
AWS GovCloud for US federal compliance

Data Center Locations

Coverage

Countries: United States, Canada, Brazil, Ireland, United Kingdom, Germany, France, Sweden, Spain, Switzerland, Italy, Japan, South Korea, Singapore, Australia, India, Hong Kong, Indonesia, Malaysia, Thailand, New Zealand, South Africa, Israel, United Arab Emirates, Bahrain, China
Cities: Ashburn VA, Columbus OH, Hillsboro OR, San Jose CA, Montreal, São Paulo, Dublin, London, Frankfurt, Paris, Stockholm, Milan, Zurich, Tokyo, Osaka, Seoul, Singapore, Sydney, Melbourne, Mumbai, Hyderabad, Hong Kong, Jakarta, Kuala Lumpur, Bangkok, Auckland, Cape Town, Tel Aviv, Dubai, Bahrain, Beijing, Ningxia
Multi-Region Failover: Yes (automatic and manual failover via Route 53, multi-AZ, and multi-region architectures)
Latency Tiers: Ultra-low (<1 ms intra-AZ); standard cloud latency inter-region; AWS Local Zones for <10 ms to metro areas
North America, South America, Europe, Asia-Pacific, Middle East, Africa

Compliance Regions

EU Data Residency: Yes (Dublin, Frankfurt, Paris, Stockholm, Milan, and Spain regions; the Zurich region additionally covers Swiss data residency)
US Gov Cloud: Yes (FedRAMP High authorized; AWS GovCloud US-East and US-West regions)
India Region: Yes (Mumbai and Hyderabad)

Key Strengths

Broadest GPU and AI accelerator portfolio including proprietary Trainium and Inferentia chips
Deepest ecosystem integration with S3, EKS, SageMaker, and 200+ AWS services
Global scale with 30+ regions enabling low-latency deployment worldwide
Managed ML platform (SageMaker) reduces infrastructure overhead for AI workloads
Spot Instances offer significant cost savings for fault-tolerant training workloads

Known Limitations

GPU pricing is generally higher than specialized GPU cloud providers
Complexity of AWS ecosystem can be overwhelming for smaller teams
H100 availability can be constrained; waitlists common for large allocations
Egress bandwidth costs can significantly increase total spend
Spot Instance interruptions require robust checkpointing strategies for long training runs
Proprietary Trainium/Inferentia chips require code adaptation and have limited third-party software support

Additional Information

Support Options

24/7 phone support, ticket submissions, technical support plans

Community

Large community via AWS re:Post forums, AWS Developer Forums, re:Invent conference, AWS User Groups globally, Stack Overflow (largest cloud tag), GitHub AWS org, and active social media presence

Green Energy

Reached 100% renewable energy in 2023, ahead of its 2025 target; member of The Climate Pledge, targeting net-zero carbon by 2040

PUE Rating

1.2 (global average as reported by AWS)

Core Proposition

Broadest global infrastructure footprint with the widest GPU instance variety, deep AWS service integration, and mature enterprise compliance and security certifications.

Notable Customers

Netflix
NASA
Airbnb
Samsung
BMW
Goldman Sachs
Pfizer
Epic Games

Payment Methods

Credit Card, ACH/Bank Transfer, Wire Transfer, AWS Marketplace, Purchase Order (enterprise)
Last updated March 2026. Information subject to change.