GPU Cost

RR
Ryan Rutan

GPU Cost

GPU cost is the underlying compute cost of training and running AI models, dominated by Nvidia's H100, H200, B200, and B300 chips at $25,000-$50,000 each. GPU availability and cost are the limiting factor for AI training because foundation model labs need thousands of GPUs running together in clusters. The GPU supply chain is the single largest infrastructure story of the 2020s tech boom. Behind every AI capability is a stack of expensive GPUs running hot.

The Nvidia GPU lineup (mid-2026):

GPULaunchApproximate priceUse case
A1002020$10K-$15KLegacy AI workloads
H1002022$25K-$40KMainstream LLM training/inference
H2002024$30K-$50KImproved memory for large models
B100/B2002024-2025$40K-$60KNext-gen Blackwell architecture
B300 (Blackwell Ultra)Jan 2026$40K-$60K288GB HBM3e; DGX B300 system $300-350K
GB200 NVL72 (rack-scale)2024+$3M+ per rackFull system for largest training
Vera Rubin (R100)sampling Q4 2026TBDNext-gen architecture; HBM4 288GB, 13 TB/s; DGX Rubin rack ~$3.5-4M

Total Nvidia data center GPU revenue FY2025: ~$130B+. Most went to hyperscalers (Microsoft, Google, Meta, Amazon) and OpenAI/Anthropic.

The training compute requirements:

Frontier model training (GPT-5, Claude 4, Gemini 2):

  • 10,000-100,000+ GPUs in cluster.
  • Months of training time.
  • $100M-$1B+ in compute alone (not counting data center, networking, cooling).

Mid-size training (specialized models, fine-tuning):

  • 8-256 GPUs.
  • Days to weeks.
  • $10K-$1M in compute.

Small training and fine-tuning:

  • 1-8 GPUs.
  • Hours to days.
  • $100-$10,000 in compute.

The cloud GPU pricing (2025):

ProviderServiceH100 hourlyNotes
AWSP5 instances$3-$8/hourOn-demand; reserved cheaper
Google CloudA3 instances$3-$7/hourStrong A100/H100 supply
AzureNDv5$3-$8/hourOpenAI's primary partner
Specialty (Lambda, Modal, Replicate, Together)Various$2-$6/hourBetter availability for startups

Annual cost: a single H100 running 24/7 = $25K-$70K per year (depending on provider and reservation).

The GPU shortage and supply chain:

2022-2024: severe GPU shortage. Companies waited 6-12 months for orders.

2024-2025: shortage easing as Nvidia ramps production. B200/B300 launch.

Allocation politics: hyperscalers and frontier labs get priority. Smaller startups often access via cloud providers.

Chinese export controls: US restrictions on selling high-end GPUs to China affecting supply dynamics.

Custom silicon: Google TPU, Amazon Trainium, Microsoft Maia all attempting to reduce Nvidia dependence. Apple, Meta also designing custom AI chips.

The startup implications:

Don't build foundation model labs: requires $100M-$1B+ in GPU infrastructure.

Use cloud GPUs: AWS, GCP, Azure, plus specialty providers (Lambda, Modal, Replicate, Together AI, RunPod).

Inference vs training: most startups never train; only run inference (much cheaper).

API vs self-hosted: API access (OpenAI, Anthropic) abstracts away GPU cost; self-hosting requires GPU management.

Cost per query: dominated by GPU compute amortized across users.

The economics evolution:

Per-FLOP cost decline: about 30-50% per year improvement in FLOPs per dollar.

Combined with model efficiency: total cost per useful inference dropping ~10x per 18-24 months.

Inference vs training cost ratio: shifting toward inference as model usage scales. Training is one-time; inference is forever.

The Nvidia moat:

CUDA software ecosystem: ~15 years of CUDA optimization makes switching to other GPUs hard.

Hardware advantage: industry-leading performance per chip and per system.

Network effects: AI talent trained on CUDA; libraries built for CUDA; legacy code on CUDA.

Capital allocation: Nvidia investing $30B+/year in R&D, far ahead of competitors.

Ryan's Take

GPU cost is the physics underneath your AI economics, and the right move depends on which game you're in. Building an application: use API access and don't touch GPUs. Building AI infrastructure: use specialty providers like Lambda, Modal, or Together AI. Running a foundation-model lab is a different, far more capital-intensive game entirely. The trap is self-hosting production AI with no GPU-ops expertise and praying costs fall fast enough to rescue bad unit economics, so design around today's prices.

What founders get wrong: Underestimating the GPU dimension of AI economics. Inference costs are GPU costs in disguise; understanding the GPU layer helps predict cost trajectory and plan around it. The right discipline: use cloud GPUs/APIs at small scale; understand the GPU supply chain dynamics; design pricing assuming current costs (improvements are a bonus).

Related: Inference Cost · Foundation Model · Training Data · AI Startup

FAQ

What is GPU cost?
The underlying compute cost of training and running AI models, dominated by Nvidia's data center GPU lineup (H100, H200, B200). Single chips cost $25K-$50K. GPU availability and cost is the limiting factor for AI training and a significant component of inference economics.

How much does an H100 cost?
$25K-$40K to buy. Cloud rental: $3-$8/hour on AWS/GCP/Azure, $2-$6/hour on specialty providers (Lambda, Modal, RunPod). Annual cost running 24/7: $25K-$70K depending on provider.

How many GPUs are needed to train a foundation model?
Frontier models (GPT-5, Claude 4): 10,000-100,000+ GPUs in cluster. Months of training time. $100M-$1B+ in compute alone. Mid-size training: 8-256 GPUs, days-weeks, $10K-$1M. Small fine-tuning: 1-8 GPUs, hours-days, $100-$10K.

Why is Nvidia so dominant?
CUDA software ecosystem (15+ years of optimization), industry-leading hardware performance, network effects (AI talent trained on CUDA, libraries built for CUDA), and $30B+/year R&D investment. Custom alternatives (Google TPU, Amazon Trainium, Microsoft Maia) gaining ground but Nvidia still dominant.

Find this article helpful?

This is just a small sample! Register to unlock our in-depth courses, hundreds of video courses, and a library of playbooks and articles to grow your startup fast. Let us Let us show you!

OR

GoogleLinkedInFacebookX/Twitter

Submission confirms agreement to our Terms of Service and Privacy Policy.