Prompt Tuning & Soft Prompts

Prompt tuning is a lightweight adaptation technique that sits between prompt engineering (changing the text of the message you send) and full fine-tuning (retraining the model). Instead of crafting a text prompt or modifying model weights, you train a small set of "soft prompt" vectors: numerical representations that are prepended to your input before the model processes it. These soft prompts aren't human-readable words; they're optimised numbers that steer the model's behaviour in the direction you want. Think of it as finding the perfect instruction in the model's own internal language, rather than trying to express that instruction in English.

The trained soft prompts are tiny, often just a few kilobytes, making them extremely cheap to store and swap between tasks. You can have hundreds of task-specific soft prompts sharing a single base model.

The trade-off is that prompt tuning typically doesn't match the performance of full fine-tuning, or even LoRA, for complex adaptations. It works best for steering the model's style, format or focus rather than teaching it genuinely new knowledge. For businesses with many similar but distinct use cases (different product lines, regions or customer segments), prompt tuning offers an efficient way to maintain customised behaviour without the overhead of multiple fine-tuned models.
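The mechanics can be sketched in a few lines of PyTorch. This is a minimal illustration, not a production recipe: the `SoftPromptModel` wrapper name is invented for this example, and a single frozen linear layer stands in for a real transformer. The essential moves are real, though: the base model and embedding table are frozen, a small matrix of trainable vectors is prepended to the embedded input, and that matrix is the only thing the optimiser touches.

```python
import torch
import torch.nn as nn

class SoftPromptModel(nn.Module):
    """Prepends trainable soft prompt vectors to a frozen base model's input.

    (Illustrative name; a toy linear layer stands in for a real transformer.)
    """

    def __init__(self, base_model, embedding, n_prompt_tokens, d_model):
        super().__init__()
        self.base = base_model
        self.embed = embedding
        # Freeze everything except the soft prompt.
        for p in self.base.parameters():
            p.requires_grad = False
        for p in self.embed.parameters():
            p.requires_grad = False
        # The entire trainable state: n_prompt_tokens vectors of size d_model.
        self.soft_prompt = nn.Parameter(torch.randn(n_prompt_tokens, d_model) * 0.02)

    def forward(self, input_ids):
        tokens = self.embed(input_ids)                      # (batch, seq, d_model)
        prompt = self.soft_prompt.unsqueeze(0).expand(tokens.size(0), -1, -1)
        x = torch.cat([prompt, tokens], dim=1)              # prepend soft prompt
        return self.base(x)

d_model, vocab_size = 16, 100
embedding = nn.Embedding(vocab_size, d_model)
base = nn.Linear(d_model, vocab_size)   # toy stand-in for a frozen LLM
model = SoftPromptModel(base, embedding, n_prompt_tokens=8, d_model=d_model)

# Only the soft prompt is trainable: 8 x 16 numbers here, a few KB at LLM scale.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]

# One optimisation step updates the soft prompt and nothing else.
optimiser = torch.optim.Adam([model.soft_prompt], lr=1e-2)
logits = model(torch.randint(0, vocab_size, (2, 5)))
loss = logits.mean()   # placeholder loss; a real task would use cross-entropy
loss.backward()
optimiser.step()
```

Because the saved artifact is just that one small tensor, swapping tasks means loading a different soft prompt into the same frozen base model, which is what makes keeping hundreds of task-specific variants practical.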