Transfer Learning

Transfer learning is the reason you can get useful AI results without millions of training examples and a supercomputer budget. The idea is simple: take a model that has already been trained on a large, general dataset and adapt it to your specific task. Instead of starting from scratch, you start with a model that already understands language, recognises visual features, or grasps some other general domain, and you fine-tune it with your own smaller, more focused dataset.

A language model pre-trained on billions of words of internet text already knows grammar, facts, and reasoning patterns. Fine-tuning it on a few thousand examples of your customer service conversations is far easier than training a comparable model from scratch. The same principle applies in computer vision, speech recognition, and many other domains.

Transfer learning has democratised AI in a meaningful way. Before it became widespread, only organisations with massive datasets and computing budgets could build effective models. Now, a small team with a few hundred labelled examples can often achieve impressive results by fine-tuning a pre-trained model. The catch is that the pre-trained model's knowledge reflects its training data, including any biases, gaps, or inaccuracies. What transfers is not just capability but also the model's inherited assumptions and blind spots.
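To make this concrete, here is a minimal PyTorch/torchvision sketch of the computer-vision case: load a network pre-trained on ImageNet, freeze its general-purpose feature extractor, and train only a new classification head on your small dataset. The five-class head, the learning rate, and the placeholder batch are illustrative assumptions rather than details from the text above; the same pattern carries over to fine-tuning language models.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a model pre-trained on ImageNet, a large general dataset
# (the weights API shown here requires torchvision 0.13 or later).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained layers: their general visual features transfer
# as-is, so only the new task-specific head will be trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the final ImageNet classifier (1000 classes) with a new head
# sized for a hypothetical 5-class task.
model.fc = nn.Linear(model.fc.in_features, 5)

# Optimise only the new head's parameters.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# One illustrative training step on a placeholder batch; real fine-tuning
# would loop over the small labelled dataset instead.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 5, (8,))
optimizer.zero_grad()
loss = nn.functional.cross_entropy(model(images), labels)
loss.backward()
optimizer.step()
```

Freezing everything except the head is the cheapest form of transfer. A common next step, once more labelled data is available, is to unfreeze some of the later layers and train them with a lower learning rate, so the pre-trained features are adjusted gently rather than overwritten.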