Model Hosting & Registries

A model registry is a centralised system for storing, versioning, and managing trained models. Think of it as a structured catalogue of every model your organisation has built, complete with metadata about training data, performance metrics, deployment status, and lineage. Without a registry, organisations quickly lose track of which models are deployed where, what data they were trained on, and which version is running in production.

Major platforms include MLflow's model registry, Amazon SageMaker Model Registry, Azure ML's model catalogue, and Weights & Biases. These tools let you store model artefacts, compare performance across versions, manage approval workflows for production deployment, and roll back to previous versions when needed.

Model hosting - the infrastructure that actually serves models to applications - is a related but distinct concern. Options range from managed services like SageMaker endpoints or Azure ML managed endpoints, to self-managed solutions using frameworks like TorchServe, Triton Inference Server, or vLLM for large language models. The hosting choice affects latency, throughput, cost, and operational complexity.

For organisations running multiple models in production, a proper registry and hosting infrastructure isn't optional - it's the difference between managing your AI portfolio systematically and losing track of what's running, what it does, and whether it still works as intended.
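To make the registry concepts concrete - versioned artefacts, attached metrics, stage promotion, and rollback - here is a minimal in-memory sketch. All names (`ModelRegistry`, `promote`, the `"churn"` model, its metrics) are hypothetical illustrations, not the API of MLflow or any other platform; real registries persist this state and add access control and approval workflows on top.

```python
from dataclasses import dataclass

@dataclass
class ModelVersion:
    version: int
    artefact_uri: str   # where the trained artefact lives, e.g. object storage
    metrics: dict       # performance metrics recorded at training time
    stage: str = "None" # lifecycle stage: None -> Production -> Archived

class ModelRegistry:
    """Toy registry: one entry per model name, holding an ordered version list."""

    def __init__(self):
        self._models = {}

    def register(self, name, artefact_uri, metrics):
        # Each registration appends a new, monotonically numbered version.
        versions = self._models.setdefault(name, [])
        mv = ModelVersion(len(versions) + 1, artefact_uri, dict(metrics))
        versions.append(mv)
        return mv

    def promote(self, name, version, stage="Production"):
        # Archive whichever version currently holds the target stage,
        # then move the requested version into it.
        for mv in self._models[name]:
            if mv.stage == stage:
                mv.stage = "Archived"
        target = self._models[name][version - 1]
        target.stage = stage
        return target

    def production_version(self, name):
        for mv in self._models[name]:
            if mv.stage == "Production":
                return mv
        return None

    def rollback(self, name):
        # Re-promote the most recently archived version.
        archived = [mv for mv in self._models[name] if mv.stage == "Archived"]
        if not archived:
            raise ValueError("no previous version to roll back to")
        return self.promote(name, archived[-1].version)

# Hypothetical usage: two versions of a churn model, promote, then roll back.
reg = ModelRegistry()
reg.register("churn", "s3://models/churn/1", {"auc": 0.81})
reg.register("churn", "s3://models/churn/2", {"auc": 0.84})
reg.promote("churn", 1)
reg.promote("churn", 2)   # version 1 is archived, version 2 goes live
reg.rollback("churn")     # version 2 is archived, version 1 is live again
```

The key design point mirrored from real registries is that promotion and rollback only change *pointers* (stage labels), never the immutable artefacts themselves, which is what makes rollbacks fast and auditable.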
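On the hosting side, self-managed serving frameworks such as TorchServe or Triton essentially wrap a model behind an HTTP prediction endpoint. The sketch below shows that shape with only the standard library: the `/predict` route, the `predict` function, and the hard-coded linear scorer are all stand-in assumptions, not any framework's actual interface - a real host would load the artefact referenced by a registry entry instead.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def predict(features):
    # Stand-in "model": a hard-coded linear scorer. In a real deployment this
    # would be inference against a loaded model artefact.
    weights = [0.4, -0.2, 0.1]
    return sum(w * x for w, x in zip(weights, features))

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"score": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging; real servers emit structured logs here.
        pass

# Bind to an OS-assigned port and serve in a background thread.
server = ThreadingHTTPServer(("127.0.0.1", 0), InferenceHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
```

Even this toy endpoint surfaces the operational trade-offs the section mentions: the serialisation format, the threading model, and where the model weights live all directly determine latency, throughput, and cost.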