Cloud GPU pricing is volatile and varies significantly by provider, region, commitment level, and availability. The table below provides approximate on-demand hourly rates as of early 2026. Prices should be treated as rough guidelines; always check current pricing before making decisions.
## GPU Comparison
| GPU | AWS | GCP | Azure | Lambda | RunPod |
|---|---|---|---|---|---|
| A100 80GB (1x) | $4.10/hr (p4d) | $3.67/hr (a2) | $3.67/hr (NC A100) | $1.29/hr | $1.64/hr |
| H100 80GB (1x) | $8.50/hr (p5) | $8.34/hr (a3-mega) | $8.20/hr (NC H100) | $2.49/hr | $3.29/hr |
| H200 141GB (1x) | ~$10.50/hr (p5e) | ~$10.00/hr (a3-ultra) | ~$10.00/hr | $3.49/hr | $4.49/hr |
| L40S 48GB (1x) | $2.80/hr (g6e) | $2.50/hr (g2) | $2.40/hr | $0.99/hr | $0.74/hr |
| 8x H100 cluster | $65.00/hr (p5.48xlarge) | $64.00/hr (a3-mega) | $63.00/hr | $19.92/hr | $26.32/hr |
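Since total cost is just hourly rate × duration, the table translates directly into run budgets. The sketch below estimates the cost of a hypothetical 72-hour training run on an 8x H100 cluster using the on-demand rates above (illustrative figures only, subject to the caveats below):

```python
# On-demand 8x H100 rates from the table above ($/hr, approximate).
rates_per_hour = {
    "AWS p5.48xlarge": 65.00,
    "GCP a3-mega": 64.00,
    "Azure": 63.00,
    "Lambda": 19.92,
    "RunPod": 26.32,
}

hours = 72  # hypothetical training-run length

# Total cost per provider for the same fixed workload.
totals = {provider: rate * hours for provider, rate in rates_per_hour.items()}

for provider, cost in sorted(totals.items(), key=lambda kv: kv[1]):
    print(f"{provider}: ${cost:,.2f}")
```

At these rates the same job spans roughly $1,400 to $4,700, which is why the specialized providers are attractive for batch training despite offering fewer managed services.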
## Pricing Caveats
These figures are approximate on-demand rates. Reserved instances (1-3 year commitments) can reduce costs by 30-60%. Spot/preemptible instances offer 60-80% savings but can be interrupted. GPU-specialized providers like Lambda and RunPod typically offer lower per-GPU rates but fewer managed services. Prices change frequently, so verify before budgeting.
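Spot savings are only worthwhile if they survive the cost of interruptions. A quick back-of-envelope check, assuming a 70% spot discount (the middle of the range above) and an illustrative 10% of compute lost to redone work:

```python
# Does spot still win after accounting for work lost to interruptions?
on_demand = 8.50      # $/hr, single H100 on-demand (from the table above)
spot_discount = 0.70  # 70% off, middle of the 60-80% range
lost_work = 0.10      # assumption: 10% of compute is redone after preemptions

spot_rate = on_demand * (1 - spot_discount)
effective_rate = spot_rate / (1 - lost_work)  # $/hr of *useful* compute
savings = 1 - effective_rate / on_demand

print(f"spot rate: ${spot_rate:.2f}/hr")
print(f"effective rate: ${effective_rate:.2f}/hr")
print(f"net savings vs on-demand: {savings:.0%}")
```

Even with 10% of work redone, the effective savings stay near 67%, which is why spot plus checkpointing (discussed below) is usually the first cost lever to pull.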
## Cost Reduction Strategies
- Spot instances with checkpointing: For training jobs that can tolerate interruptions, spot pricing offers dramatic savings. Implement checkpoint saving every 15-30 minutes to minimize lost work.
- Right-size your GPU: An L40S is often sufficient for inference on quantized models up to ~30B parameters. Reserve H100/H200 for training or large-model serving.
- Time-of-day arbitrage: Some cloud regions have lower demand overnight, potentially improving spot availability.
- Multi-cloud strategy: Use the cheapest available GPU across providers for batch workloads. Tools like SkyPilot automate this.
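The checkpointing strategy above can be sketched without any ML framework: persist the training state atomically at regular intervals, and resume from the last checkpoint on restart. The file name, interval, and "training step" below are all stand-ins; in a real job you would checkpoint model and optimizer state every 15-30 minutes as suggested above.

```python
import os
import pickle

CKPT = "checkpoint.pkl"  # hypothetical checkpoint path

def save_checkpoint(state, path=CKPT):
    # Write to a temp file, then rename: an interruption mid-save
    # cannot leave a corrupt checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)

def load_checkpoint(path=CKPT):
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0}  # fresh run

state = load_checkpoint()
for step in range(state["step"], 100):
    state["total"] += step           # stand-in for one training step
    state["step"] = step + 1
    if state["step"] % 25 == 0:      # in practice: every 15-30 minutes
        save_checkpoint(state)
```

If the process is killed and relaunched, `load_checkpoint` picks up from the last saved step, so at most one interval of work is redone.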
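The right-sizing claim is easy to sanity-check with arithmetic: weight memory is parameters × bits-per-weight / 8, plus runtime overhead. The 20% overhead factor below is an assumption standing in for KV cache, activations, and runtime buffers:

```python
# Rough VRAM estimate for serving a quantized 30B-parameter model,
# to check the claim that an L40S (48 GB) is often sufficient.
params = 30e9
bits_per_weight = 4   # e.g. 4-bit quantization
overhead = 1.2        # assumed ~20% for KV cache, activations, runtime

weights_gb = params * bits_per_weight / 8 / 1e9
total_gb = weights_gb * overhead

print(f"weights: {weights_gb:.1f} GB, total: ~{total_gb:.1f} GB")
```

About 15 GB of weights and roughly 18 GB total fits comfortably in an L40S's 48 GB, whereas an fp16 copy of the same model (~60 GB of weights alone) would not.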