Training vs Inference
Training and inference are the two fundamental modes of operating an AI model, and they have very different characteristics. Training is the learning phase - the computationally expensive process of adjusting parameters using vast amounts of data. It typically happens once (or occasionally, with periodic updates) and requires enormous computing resources: thousands of specialised processors running for weeks or months. Inference is the using phase - when you send a prompt to ChatGPT and get a response, that's inference. It's much cheaper per operation than training, but because it happens millions of times per day across all users, the total cost adds up quickly.

This distinction matters for understanding the economics of AI. The upfront training cost is a massive fixed investment, while inference is an ongoing variable cost that scales with usage. It's why AI providers charge per token - they're covering inference costs. It also explains why there's so much research into making inference faster and cheaper: even small efficiency gains, multiplied by billions of daily queries, translate into significant savings.

When you're using AI, the cost you pay is almost entirely inference cost - you're renting access to a model that someone else spent a fortune training.
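The fixed-versus-variable split can be made concrete with a small sketch. All the dollar figures below are made-up placeholders chosen only to illustrate the shape of the curve, not real provider economics:

```python
# Hypothetical illustration of AI cost structure: training is a one-off
# fixed cost; inference is a per-query variable cost that scales with usage.
# Every number here is an invented placeholder, not real provider data.

def total_cost(training_cost: float, cost_per_query: float, queries: int) -> float:
    """One-off fixed training cost plus variable inference cost."""
    return training_cost + cost_per_query * queries

TRAINING_COST = 100_000_000   # e.g. $100M once (hypothetical)
COST_PER_QUERY = 0.002        # e.g. $0.002 per query (hypothetical)

for queries in (1_000_000, 1_000_000_000, 100_000_000_000):
    cost = total_cost(TRAINING_COST, COST_PER_QUERY, queries)
    inference_share = COST_PER_QUERY * queries / cost
    print(f"{queries:>15,} queries: total ${cost:,.0f}, "
          f"inference share {inference_share:.0%}")
```

Under these assumed numbers, inference is a rounding error at a million queries but dominates total cost at a hundred billion - which is also why a small reduction in per-query cost, multiplied across that volume, is worth serious engineering effort.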