Explainability (Post-Hoc Explanations)

Most AI explanations you encounter are post-hoc - generated after the model has already made its decision, offering a plausible account of why it reached that conclusion. Techniques like SHAP values, LIME, and attention visualisations don't show you how the model actually computed its answer; they show you which input features were most associated with the output.

Think of it like asking someone why they chose a particular restaurant. Their answer ("great reviews, close to the office") might be a reasonable reconstruction, but it may not capture the actual decision process (which might have been "it was the first one I thought of").

Post-hoc explanations face a fundamental accuracy-simplicity trade-off. A truly faithful explanation of a large neural network's reasoning would be incomprehensible to most people. A simple, human-friendly explanation necessarily leaves things out and may even be misleading.

For practical purposes, post-hoc explanations are still valuable - they can highlight unexpected factors, catch obvious errors, and give users a mental model for how the system behaves. Just don't mistake them for a window into the machine's actual reasoning process.
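To make the "treats the model as a black box" point concrete, here is a minimal sketch of the perturbation idea that families like LIME and SHAP build on: the explainer never inspects the model's internals, it only nudges inputs and watches the output move. The dataset, the random-forest model, the naive "replace one feature with its column mean" scheme, and the function name perturbation_attributions are all illustrative assumptions, not the algorithm used by any particular library.

```python
# Sketch of a perturbation-based post-hoc explanation (illustrative, not LIME/SHAP itself).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
X, y = data.data, data.target
model = RandomForestClassifier(random_state=0).fit(X, y)

def perturbation_attributions(model, X, instance):
    """Score each feature by how much the predicted probability shifts
    when that feature is replaced with its dataset mean."""
    baseline = model.predict_proba(instance.reshape(1, -1))[0, 1]
    scores = np.zeros(instance.shape[0])
    for j in range(instance.shape[0]):
        perturbed = instance.copy()
        perturbed[j] = X[:, j].mean()          # "knock out" one feature
        new_prob = model.predict_proba(perturbed.reshape(1, -1))[0, 1]
        scores[j] = baseline - new_prob        # positive: feature pushed the prediction up
    return scores

scores = perturbation_attributions(model, X, X[0])
for j in np.argsort(np.abs(scores))[::-1][:5]:     # five most influential features
    print(f"{data.feature_names[j]:<25} {scores[j]:+.3f}")
```

Notice what this does and doesn't tell you: the scores describe how the output responds to input perturbations, which is exactly the input-output association described above - not a trace of the computation the model performed internally.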