GPU Cloud Provider
Fly.io
Fly.io offered a range of GPU machines suited to compute-intensive applications such as inference, model training, and high-precision computation. These services are being discontinued and will be unavailable after August 1.
GPUs: 1
Founded: Unknown
Countries: 15
Data Centers: 21
Team Size: 51-200
GPU Marketplace

NVIDIA A100 80GB SXM: On-Demand
Company Profile
Company Type: Scale-up
Provider Type: Cloud Provider
Legal Entity: Fly.io, Inc.
Funding: Series A/B/C/D
Total Raised: ~$70M
Team Size: 51-200
Investors: Andreessen Horowitz (a16z)
Infrastructure
GPU Fleet: NVIDIA A100 80GB, NVIDIA L40S 48GB
Virtualization: Cloud Hypervisor instead of Firecracker for GPU-enabled machines
Storage: Ephemeral storage up to 50GB; Fly Volumes up to 500GB
Data Center Tier: Carrier-neutral colocation, distributed edge
Bare Metal: No
Availability: Deprecated; services ending after August 1
Startup · Developer · Hobbyist · SMB
Compute & Deployment
On-Demand: Yes
Spot / Interruptible: No
Reserved Instances: No
Bare Metal: No
VM-Based: No
Container-Based: Yes (Docker-based via Firecracker microVMs)
Kubernetes: No
Serverless GPU: No
Spin-Up Time: Under 1 minute
Terraform: Yes (community provider)
GPU Hardware
Latest Gen: L40S
Legacy Support: A10, A100
Multi-GPU Nodes: Yes (up to 4x per node)
Max GPUs/Node: 4
NVLink: No
InfiniBand: No
PCIe vs SXM: PCIe only
HGX Platform: No
Pricing Model
Per Hour: Yes (prices quoted per hour)
Per Second: Yes (usage metered and billed per second)
Subscription: No
Reserved Discount: No
Spot Discount: No spot pricing
Public Pricing: Yes
Hidden Fees: IP address charges ($0.02/hr for dedicated IPv4), volume snapshot charges
Egress Charges: $0.02/GB after free allowance (100GB free per month)
Pay-as-you-go: Yes
Credit System: No
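The billing terms above combine into a straightforward cost estimate. A minimal sketch in Python, assuming a hypothetical hourly GPU rate (Fly.io's actual rates varied by GPU type); the $0.02/GB egress fee after a 100GB monthly free allowance and the $0.02/hr dedicated IPv4 charge are taken from the figures listed in this profile.

```python
def estimate_cost(gpu_rate_per_hr: float, seconds: float,
                  egress_gb: float = 0.0, ipv4_hours: float = 0.0) -> float:
    """Estimate a Fly.io-style GPU bill in dollars.

    gpu_rate_per_hr -- hypothetical hourly GPU price (varies by GPU type)
    seconds         -- machine runtime; usage is metered per second
    egress_gb       -- outbound data transfer for the month
    ipv4_hours      -- hours a dedicated IPv4 address was attached
    """
    EGRESS_FREE_GB = 100.0  # free monthly allowance (from the profile above)
    EGRESS_RATE = 0.02      # $/GB beyond the allowance
    IPV4_RATE = 0.02        # $/hr for a dedicated IPv4

    compute = gpu_rate_per_hr / 3600.0 * seconds
    egress = max(0.0, egress_gb - EGRESS_FREE_GB) * EGRESS_RATE
    ipv4 = ipv4_hours * IPV4_RATE
    return round(compute + egress + ipv4, 2)

# 30 minutes on a hypothetical $2.50/hr GPU with 150GB monthly egress:
# 2.50/3600 * 1800 = $1.25 compute, (150-100) * 0.02 = $1.00 egress
print(estimate_cost(2.50, 1800, egress_gb=150))  # → 2.25
```

Because billing is per second, short-lived machines that stop between requests accrue compute charges only while running, which is where the fast start/stop noted under Key Strengths pays off.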
Performance & Scaling
Multi-Node Training: Limited (manual setup required, no managed distributed training)
Elastic Scaling: Manual only
Auto Scaling: Yes (Fly Machines auto-scaling for inference workloads)
InfiniBand: No (Ethernet only)
NVSwitch: No
Perf Isolation: Partial (shared infrastructure, not bare metal)
Noisy Neighbor Risk: Yes (multi-tenant shared infrastructure)
Developer Experience
Onboarding: Deploy in minutes via the flyctl CLI or web dashboard; GPU machines are provisioned like standard app machines
Frameworks: CUDA-based applications, which likely includes popular ML frameworks such as PyTorch and TensorFlow
CLI Tooling: Full-featured flyctl CLI with SSH, log streaming, secrets management, and machine lifecycle control
Jupyter: Via SSH port forwarding or a custom Docker image
Documentation: Good developer-focused docs with guides, API reference, and community examples; GPU-specific docs are less comprehensive than those for the core platform
Security & Compliance
Security
Backed by a16z (Series B lead)
Widely used by indie developers and startups
Active open-source community and transparent engineering blog
Reputable founders with prior infrastructure startup experience
Data Center Locations
Coverage
Countries: United States, United Kingdom, Germany, France, Netherlands, Spain, Australia, Canada, Brazil, Singapore, Japan, India, South Africa, Mexico, Chile
Cities: Ashburn VA, Chicago IL, Dallas TX, Los Angeles CA, Miami FL, Seattle WA, Secaucus NJ, London, Amsterdam, Frankfurt, Paris, Madrid, Sydney, Toronto, São Paulo, Singapore, Tokyo, Chennai, Johannesburg, Mexico City, Santiago
Multi-Region Failover: Yes (manual configuration via flyctl)
Latency Tiers: Standard cloud latency
North America · Europe · Asia-Pacific · South America · Africa
Compliance Regions
EU Data Residency: Yes (Amsterdam, Frankfurt, Paris, Madrid)
US Gov Cloud: No
India Region: Yes (Chennai)
Key Strengths
Developer-first experience with minimal configuration overhead
Global Anycast network enabling low-latency edge deployments
Per-second billing with fast machine start/stop for cost efficiency
Unified platform — run GPU workloads alongside web apps and databases
Strong open-source and indie developer community ethos
Known Limitations
Limited GPU model selection compared to dedicated GPU cloud providers
No bare metal or dedicated GPU node options
GPU availability can be constrained; no SLA guarantees
Not suited for large-scale HPC or multi-node GPU cluster training
No model marketplace or prebuilt AI/ML templates
Less competitive for enterprise or research-scale GPU deployments
Additional Information
Support Options
None listed
Community
Active community forum (community.fly.io), active presence on Twitter/X, and open engineering blog; no dedicated Discord
Core Proposition
Edge-native application platform that runs containerized workloads close to users across a global network of micro-regions with fast deployment via CLI.
Payment Methods
Credit Card
Last updated March 2026. Information subject to change.