Underlying compute cost of AI training and inference. Nvidia H100/H200/B200 dominate at $25K-$50K per chip. Limiting factor for training; significant for inference.

GPU Cost

Q: How much does an H100 cost?

$25K-$40K to buy. Cloud rental: $3-$8/hour. Annual 24/7: $25K-$70K depending on provider.

Q: How many GPUs are needed to train a foundation model?

Frontier: 10,000-100,000+ GPUs. $100M-$1B+ compute. Mid-size: 8-256 GPUs. Small fine-tuning: 1-8 GPUs.

Q: Why is Nvidia so dominant?

CUDA software ecosystem, industry-leading hardware, network effects, $30B+/year R&D. Alternatives exist but Nvidia still dominant.

Ryan Rutan

GPU Cost

GPU cost is the underlying compute cost of training and running AI models, dominated by Nvidia's H100, H200, B200, and B300 chips at $25,000-$50,000 each. GPU availability and cost are the limiting factor for AI training because foundation model labs need thousands of GPUs running together in clusters. The GPU supply chain is the single largest infrastructure story of the 2020s tech boom. Behind every AI capability is a stack of expensive GPUs running hot.

The Nvidia GPU lineup (mid-2026):

GPU	Launch	Approximate price	Use case
A100	2020	$10K-$15K	Legacy AI workloads
H100	2022	$25K-$40K	Mainstream LLM training/inference
H200	2024	$30K-$50K	Improved memory for large models
B100/B200	2024-2025	$40K-$60K	Next-gen Blackwell architecture
B300 (Blackwell Ultra)	Jan 2026	$40K-$60K	288GB HBM3e; DGX B300 system $300-350K
GB200 NVL72 (rack-scale)	2024+	$3M+ per rack	Full system for largest training
Vera Rubin (R100)	sampling Q4 2026	TBD	Next-gen architecture; HBM4 288GB, 13 TB/s; DGX Rubin rack ~$3.5-4M

Nvidia total revenue FY2025: ~$130B, of which roughly $115B came from the data center segment. Most of that went to hyperscalers (Microsoft, Google, Meta, Amazon) and to labs such as OpenAI and Anthropic.

The training compute requirements:

Frontier model training (GPT-5, Claude 4, Gemini 2):

10,000-100,000+ GPUs in cluster.
Months of training time.
$100M-$1B+ in compute alone (not counting data center, networking, cooling).

Mid-size training (specialized models, fine-tuning):

8-256 GPUs.
Days to weeks.
$10K-$1M in compute.

Small training and fine-tuning:

1-8 GPUs.
Hours to days.
$100-$10,000 in compute.
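The three tiers above follow from one multiplication: GPUs times hours times an hourly price. The sketch below is a rough illustration only. The function name `training_cost`, the flat $3/hour rate, and the three scenarios are assumptions chosen to land inside the ranges above, not figures from any real training run.

```python
def training_cost(num_gpus: int, hours: float, hourly_rate: float) -> float:
    """Rough compute cost in dollars: GPUs x wall-clock hours x price per GPU-hour."""
    if num_gpus <= 0 or hours < 0 or hourly_rate < 0:
        raise ValueError("num_gpus must be positive; hours and rate non-negative")
    return num_gpus * hours * hourly_rate


# Hypothetical scenarios, all priced at an assumed $3 per GPU-hour.
SCENARIOS = {
    "small fine-tune (8 GPUs, 2 days)": (8, 2 * 24),
    "mid-size run (256 GPUs, 14 days)": (256, 14 * 24),
    "frontier run (20,000 GPUs, 90 days)": (20_000, 90 * 24),
}

if __name__ == "__main__":
    for label, (gpus, hours) in SCENARIOS.items():
        print(f"{label}: ${training_cost(gpus, hours, 3.0):,.0f}")
```

The estimate covers compute only and ignores failed runs, idle time, storage, and networking, so real budgets will run higher.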

The cloud GPU pricing (2025):

Provider	Service	H100 hourly	Notes
AWS	P5 instances	$3-$8/hour	On-demand; reserved cheaper
Google Cloud	A3 instances	$3-$7/hour	Strong A100/H100 supply
Azure	ND H100 v5	$3-$8/hour	OpenAI's primary partner
Specialty (Lambda, Modal, Replicate, Together)	Various	$2-$6/hour	Better availability for startups

Annual cost: a single H100 running 24/7 = $25K-$70K per year (depending on provider and reservation).
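The annual figure is simply the hourly rate multiplied by the 8,760 hours in a year. Here is a minimal sketch of that arithmetic, plus a naive rent-versus-buy break-even. The helper names and the $30,000 purchase price in the example are illustrative assumptions, and the break-even ignores power, hosting, and staffing costs that an owned GPU also incurs.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_cost(hourly_rate: float, utilization: float = 1.0) -> float:
    """Yearly rental cost in dollars for one GPU billed at hourly_rate."""
    return hourly_rate * HOURS_PER_YEAR * utilization


def breakeven_hours(purchase_price: float, hourly_rate: float) -> float:
    """Rental hours after which renting has cost as much as buying the chip."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return purchase_price / hourly_rate


if __name__ == "__main__":
    # The $3 and $8 endpoints of the hyperscaler range quoted above.
    for rate in (3.0, 8.0):
        print(f"${rate:.0f}/hour -> ${annual_cost(rate):,.0f} per year")
    # Hypothetical comparison: a $30,000 chip versus renting at $3/hour.
    print(f"break-even after {breakeven_hours(30_000, 3.0):,.0f} rental hours")
```

At $3/hour the yearly total is $26,280 and at $8/hour it is $70,080, which matches the $25K-$70K range once rounded.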

The GPU shortage and supply chain:

2022-2024: severe GPU shortage. Companies waited 6-12 months for orders.

2024-2025: shortage easing as Nvidia ramps production. B200/B300 launch.

Allocation politics: hyperscalers and frontier labs get priority. Smaller startups often access via cloud providers.

China export controls: US restrictions on selling high-end GPUs to China affect supply dynamics.

Custom silicon: Google TPU, Amazon Trainium, Microsoft Maia all attempting to reduce Nvidia dependence. Apple, Meta also designing custom AI chips.

The startup implications:

Don't build foundation model labs: requires $100M-$1B+ in GPU infrastructure.

Use cloud GPUs: AWS, GCP, Azure, plus specialty providers (Lambda, Modal, Replicate, Together AI, RunPod).

Inference vs training: most startups never train a model; they only run inference, which is much cheaper.

API vs self-hosted: API access (OpenAI, Anthropic) abstracts away GPU cost; self-hosting requires GPU management.

Cost per query: dominated by GPU compute amortized across users.
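To make "GPU compute amortized across users" concrete, here is a back-of-the-envelope sketch: divide what a GPU costs per hour by the number of queries it actually serves in that hour. The function name, the throughput figure, and the 50% utilization default are assumptions for illustration. Real throughput depends heavily on the model, batch size, and prompt length.

```python
def cost_per_query(hourly_rate: float,
                   queries_per_second: float,
                   utilization: float = 0.5) -> float:
    """GPU cost in dollars attributed to one query.

    hourly_rate: price of one GPU-hour.
    queries_per_second: peak throughput of that GPU (assumed, workload-specific).
    utilization: fraction of the hour the GPU spends serving real traffic.
    """
    if queries_per_second <= 0 or not 0 < utilization <= 1:
        raise ValueError("throughput must be positive and utilization in (0, 1]")
    queries_served_per_hour = queries_per_second * 3600 * utilization
    return hourly_rate / queries_served_per_hour


if __name__ == "__main__":
    # Hypothetical: a $4/hour GPU, 10 queries/second at peak, busy half the time.
    print(f"${cost_per_query(4.0, 10.0):.6f} per query")
```

Utilization matters as much as the hourly price: a GPU that sits idle half the time doubles the effective cost of every query it does serve.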

The economics evolution:

Per-FLOP cost decline: about 30-50% per year improvement in FLOPs per dollar.

Combined with model efficiency: total cost per useful inference is dropping ~10x every 18-24 months.

Inference vs training cost ratio: shifting toward inference as model usage scales. Training is one-time; inference is forever.
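The "10x every 18-24 months" figure can be turned into a yearly rate and a projection with a little exponent arithmetic. This is a sketch under the assumption that the decline is smooth and continues at the same pace, which is a simplification. The function names are hypothetical.

```python
def annual_factor(total_factor: float, months: float) -> float:
    """Convert 'total_factor improvement every `months` months' to a per-year factor."""
    if months <= 0:
        raise ValueError("months must be positive")
    return total_factor ** (12.0 / months)


def projected_cost(cost_today: float,
                   months_ahead: float,
                   total_factor: float = 10.0,
                   period_months: float = 24.0) -> float:
    """Cost after months_ahead, assuming a steady total_factor drop per period."""
    return cost_today / total_factor ** (months_ahead / period_months)


if __name__ == "__main__":
    # 10x per 18 months is about 4.6x per year; 10x per 24 months is about 3.2x.
    for months in (18.0, 24.0):
        print(f"10x per {months:.0f} months = {annual_factor(10.0, months):.2f}x per year")
    # Hypothetical: a query that costs $0.01 today, one year out, slower case.
    print(f"${projected_cost(0.01, 12.0):.5f} per query in 12 months")
```

The projection is only as good as the assumed trend, which is why pricing built on current costs is the safer default.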

The Nvidia moat:

CUDA software ecosystem: 15+ years of CUDA optimization make switching to other GPUs hard.

Hardware advantage: industry-leading performance per chip and per system.

Network effects: AI talent trained on CUDA; libraries built for CUDA; legacy code on CUDA.

Capital allocation: Nvidia investing $30B+/year in R&D, far ahead of competitors.

Ryan's Take

GPU cost is the physics underneath your AI economics, and the right move depends on which game you're in. Building an application: use API access and don't touch GPUs. Building AI infrastructure: use specialty providers like Lambda, Modal, or Together AI. Running a foundation-model lab is a different, far more capital-intensive game entirely. The trap is self-hosting production AI with no GPU-ops expertise and praying costs fall fast enough to rescue bad unit economics, so design around today's prices.

What founders get wrong: Underestimating the GPU dimension of AI economics. Inference costs are GPU costs in disguise; understanding the GPU layer helps predict cost trajectory and plan around it. The right discipline: use cloud GPUs/APIs at small scale; understand the GPU supply chain dynamics; design pricing assuming current costs (improvements are a bonus).

Related: Inference Cost · Foundation Model · Training Data · AI Startup

FAQ

What is GPU cost?
The underlying compute cost of training and running AI models, dominated by Nvidia's data center GPU lineup (H100, H200, B200). Single chips cost $25K-$50K. GPU availability and cost are the limiting factor for AI training and a significant component of inference economics.

How much does an H100 cost?
$25K-$40K to buy. Cloud rental: $3-$8/hour on AWS/GCP/Azure, $2-$6/hour on specialty providers (Lambda, Modal, RunPod). Annual cost running 24/7: $25K-$70K depending on provider.

How many GPUs are needed to train a foundation model?
Frontier models (GPT-5, Claude 4): 10,000-100,000+ GPUs in cluster. Months of training time. $100M-$1B+ in compute alone. Mid-size training: 8-256 GPUs, days-weeks, $10K-$1M. Small fine-tuning: 1-8 GPUs, hours-days, $100-$10K.

Why is Nvidia so dominant?
CUDA software ecosystem (15+ years of optimization), industry-leading hardware performance, network effects (AI talent trained on CUDA, libraries built for CUDA), and $30B+/year R&D investment. Custom alternatives (Google TPU, Amazon Trainium, Microsoft Maia) are gaining ground, but Nvidia remains dominant.
