AI Supply Chain Security

Modern AI systems are built on a complex supply chain of pretrained models, datasets, libraries, and tools, most of which come from external sources. A typical AI application might use a pretrained model from Hugging Face, fine-tuned on a third-party dataset, built with libraries like PyTorch or TensorFlow, and deployed on infrastructure managed by a cloud provider.

Each link in this chain is a potential vulnerability. Malicious models can contain backdoors: hidden behaviours triggered by specific inputs. Compromised datasets can poison training. Vulnerabilities in ML frameworks can be exploited just like any other software dependency.

The ML supply chain is in many ways less mature than the traditional software supply chain, with fewer established practices around signing, verification, and provenance. Organisations like Hugging Face have introduced model scanning and security features, but the ecosystem is still catching up.

Practical steps include vetting the sources of pretrained models and datasets, scanning ML libraries for known vulnerabilities, maintaining an inventory of your AI dependencies, and having a plan for responding to supply chain compromises. The same principles that apply to software supply chain security (know what you're using, keep it updated, verify its integrity, and plan for compromise) apply to AI, with the added complexity that verifying a model's behaviour is fundamentally harder than verifying software code.
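One concrete form of "verify its integrity" is pinning a cryptographic digest of each model or dataset artifact when it is first vetted, then checking every later download against that pin. A minimal sketch using only Python's standard library (the pinned digest shown is the SHA-256 of the empty byte string, used purely so the example is self-checking; real pins would come from your own vetting process):

```python
import hashlib

def verify_artifact(data: bytes, expected_digest: str) -> bool:
    """Compare the SHA-256 of a downloaded artifact against a pinned digest."""
    actual = hashlib.sha256(data).hexdigest()
    return actual == expected_digest

# Hypothetical pin recorded at vetting time. This particular value is the
# well-known SHA-256 of b"", chosen only to make the example self-checking.
PINNED_DIGEST = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

assert verify_artifact(b"", PINNED_DIGEST)            # matches the pin
assert not verify_artifact(b"tampered", PINNED_DIGEST)  # any change is caught
```

In practice you would stream large model files through the hash in chunks rather than loading them into memory, and store the pins alongside the dependency inventory so they are versioned with the rest of the application.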