GPU Cloud Provider
Fly.io
Fly.io offered a range of GPU machines suited to compute-intensive applications such as inference, model training, and high-precision computation. These services are being discontinued and will be unavailable after August 1.
GPUs: 1
Founded: Unknown
Countries: 15
Data Centers: 21
Team Size: 51-200
GPU Marketplace

NVIDIA A100 80GB SXM: On-Demand
Company Profile
Company Type: Scale-up
Provider Type: Cloud Provider
Legal Entity: Fly.io, Inc.
Funding: Series A/B/C/D
Total Raised: ~$70M
Team Size: 51-200
Investors: Andreessen Horowitz (a16z)
Infrastructure
GPU Fleet: NVIDIA A100 80GB, NVIDIA L40S 48GB
Virtualization: Cloud Hypervisor instead of Firecracker for GPU-enabled machines
Storage: Ephemeral storage up to 50GB; Fly Volumes up to 500GB
Data Center Tier: Carrier-neutral colocation, distributed edge
Bare Metal: No
Availability: Deprecated; services ending after August 1
Startup · Developer · Hobbyist · SMB
Compute & Deployment
On-Demand: Yes
Spot / Interruptible: No
Reserved Instances: No
Bare Metal: No
VM-Based: No
Container-Based: Yes (Docker-based via Firecracker microVMs)
Kubernetes: No
Serverless GPU: No
Spin-Up Time: Under 1 minute
Terraform: Yes (community provider)
GPU Hardware
Latest Gen: L40S
Legacy Support: A10, A100
Multi-GPU Nodes: Yes (up to 4x per node)
Max GPUs/Node: 4
NVLink: No
InfiniBand: No
PCIe vs SXM: PCIe only
HGX Platform: No
Pricing Model
Per Hour: Yes (prices quoted per hour)
Per Second: Yes (usage metered and billed per second)
Subscription: No
Reserved Discount: No
Spot Discount: No spot pricing
Public Pricing: Yes
Hidden Fees: IP address charges ($0.02/hr for dedicated IPv4), volume snapshot charges
Egress Charges: $0.02/GB after free allowance (100GB free per month)
Pay-as-you-go: Yes
Credit System: No
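The billing terms above combine into a straightforward cost estimate. A minimal sketch in Python, assuming a hypothetical hourly GPU rate (Fly.io's actual rates varied by GPU type); the $0.02/GB egress fee after a 100GB monthly free allowance and the $0.02/hr dedicated IPv4 charge are taken from the figures listed in this profile.

```python
def estimate_cost(gpu_rate_per_hr: float, seconds: float,
                  egress_gb: float = 0.0, ipv4_hours: float = 0.0) -> float:
    """Estimate a Fly.io-style GPU bill in dollars.

    gpu_rate_per_hr -- hypothetical hourly GPU price (varies by GPU type)
    seconds         -- machine runtime; usage is metered per second
    egress_gb       -- outbound data transfer for the month
    ipv4_hours      -- hours a dedicated IPv4 address was attached
    """
    EGRESS_FREE_GB = 100.0  # free monthly allowance (from the profile above)
    EGRESS_RATE = 0.02      # $/GB beyond the allowance
    IPV4_RATE = 0.02        # $/hr for a dedicated IPv4

    compute = gpu_rate_per_hr / 3600.0 * seconds
    egress = max(0.0, egress_gb - EGRESS_FREE_GB) * EGRESS_RATE
    ipv4 = ipv4_hours * IPV4_RATE
    return round(compute + egress + ipv4, 2)

# 30 minutes on a hypothetical $2.50/hr GPU with 150GB monthly egress:
# 2.50/3600 * 1800 = $1.25 compute, (150-100) * 0.02 = $1.00 egress
print(estimate_cost(2.50, 1800, egress_gb=150))  # → 2.25
```

Because billing is per second, short-lived machines that stop between requests accrue compute charges only while running, which is where the fast start/stop noted under Key Strengths pays off.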
Performance & Scaling
Multi-Node Training: Limited (manual setup required, no managed distributed training)
Elastic Scaling: Manual only
Auto Scaling: Yes (Fly Machines auto-scaling for inference workloads)
InfiniBand: No (Ethernet only)
NVSwitch: No
Perf Isolation: Partial (shared infrastructure, not bare metal)
Noisy Neighbor Risk: Yes (multi-tenant shared infrastructure)
Developer Experience
Onboarding: Deploy in minutes via the flyctl CLI or web dashboard; GPU machines are provisioned like standard app machines
Frameworks: CUDA-based applications, which likely includes popular ML frameworks such as PyTorch and TensorFlow
CLI Tooling: Full-featured flyctl CLI with SSH, log streaming, secrets management, and machine lifecycle control
Jupyter: Via SSH port forwarding or a custom Docker image
Documentation: Good developer-focused docs with guides, API reference, and community examples; GPU-specific docs are less comprehensive than those for the core platform
Security & Compliance
Security
Backed by a16z (Series B lead)
Widely used by indie developers and startups
Active open-source community and transparent engineering blog
Reputable founders with prior infrastructure startup experience
Data Center Locations
Coverage
Countries: United States, United Kingdom, Germany, France, Netherlands, Spain, Australia, Canada, Brazil, Singapore, Japan, India, South Africa, Mexico, Chile
Cities: Ashburn VA, Chicago IL, Dallas TX, Los Angeles CA, Miami FL, Seattle WA, Secaucus NJ, London, Amsterdam, Frankfurt, Paris, Madrid, Sydney, Toronto, São Paulo, Singapore, Tokyo, Chennai, Johannesburg, Mexico City, Santiago
Multi-Region Failover: Yes (manual configuration via flyctl)
Latency Tiers: Standard cloud latency
North America · Europe · Asia-Pacific · South America · Africa
Compliance Regions
EU Data Residency: Yes (Amsterdam, Frankfurt, Paris, Madrid)
US Gov Cloud: No
India Region: Yes (Chennai)
Key Strengths
Developer-first experience with minimal configuration overhead
Global Anycast network enabling low-latency edge deployments
Per-second billing with fast machine start/stop for cost efficiency
Unified platform — run GPU workloads alongside web apps and databases
Strong open-source and indie developer community ethos
Known Limitations
Limited GPU model selection compared to dedicated GPU cloud providers
No bare metal or dedicated GPU node options
GPU availability can be constrained; no SLA guarantees
Not suited for large-scale HPC or multi-node GPU cluster training
No model marketplace or prebuilt AI/ML templates
Less competitive for enterprise or research-scale GPU deployments
Additional Information
Support Options
None listed
Community
Active community forum (community.fly.io), active presence on Twitter/X, and open engineering blog; no dedicated Discord
Core Proposition
Edge-native application platform that runs containerized workloads close to users across a global network of micro-regions with fast deployment via CLI.
Payment Methods
Credit Card
Last updated March 2026. Information subject to change.