Challenge

Being able to provide reliable predictions is a valuable asset for any business, whether for fraud detection, ad-click prediction or sales forecasting. But even more valuable is the capability to understand why a model makes certain predictions. This understanding can be leveraged to make appropriate business decisions.

We were asked to develop an explainable model to forecast the sales of a major UK retailer, enabling them to determine how best to allocate their marketing budget between departments.

Solution

When it comes to explainability, multiple approaches are possible. You can opt for a simple model such as linear or logistic regression, whose coefficients provide the explanation directly, but that usually comes with a trade-off in predictive performance. Or you can choose a black-box model such as an ensemble or a neural network, and pair it with a post-hoc explainability technique such as LIME or SHAP.
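As a minimal sketch of the first option, the snippet below fits a linear regression on made-up marketing-spend features (the names and data are purely illustrative, not from the project) and reads the fitted coefficients as the explanation:

```python
# A minimal sketch of coefficient-based explainability with a linear model.
# Feature names and data are hypothetical, purely for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # e.g. spend on three marketing channels
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)

# The fitted coefficients directly quantify each feature's contribution,
# which is what makes simple linear models interpretable out of the box.
for name, coef in zip(["tv_spend", "online_spend", "print_spend"], model.coef_):
    print(f"{name}: {coef:.2f}")
```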

Another question to address is what type of explanation you want: global or local? Global explanations tell you which features drive the model's decisions overall, whereas local explanations account for each individual prediction.

We were asked to provide as much information as possible and thus opted to include both global and local explanations.

Having modelled our time series with a random forest, we obtained global explanations using permutation importance and the random forest's built-in feature importance. Permutation importance shuffles the values of one feature at a time and measures how much the model's performance degrades, the idea being that shuffling an important feature should have a substantial impact on the model. Random forest feature importance is obtained by averaging, over all the trees, the decrease in impurity attributable to each feature.
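To illustrate both techniques, here is a rough sketch using scikit-learn; the features, data and hyperparameters are placeholders rather than the ones used in the project:

```python
# Sketch of the two global explanation techniques: impurity-based feature
# importance and permutation importance. Data and feature names are made up.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X = pd.DataFrame(np.random.rand(500, 4),
                 columns=["promo_spend", "price", "holiday", "weather"])
y = 10 * X["promo_spend"] - 5 * X["price"] + np.random.rand(500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

# 1. Impurity-based importance: mean decrease in impurity, averaged over the trees.
print(dict(zip(X.columns, model.feature_importances_)))

# 2. Permutation importance: shuffle each feature on held-out data and measure
#    how much the model's score drops.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
print(dict(zip(X.columns, result.importances_mean)))
```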

For local explanations, we used the SHAP framework (more specifically its TreeExplainer module). SHAP applies the idea of Shapley values to machine learning models. Shapley values come from game theory, where they are used to evaluate the contribution of each individual to a group outcome.
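As a sketch of how this looks in practice (again with a hypothetical model and data, not the project's), SHAP's TreeExplainer returns a per-feature contribution for every single prediction:

```python
# A minimal sketch of local explanations with SHAP's TreeExplainer.
# Data, feature names and the model are hypothetical stand-ins.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor

X = pd.DataFrame(np.random.rand(500, 4),
                 columns=["promo_spend", "price", "holiday", "weather"])
y = 10 * X["promo_spend"] - 5 * X["price"] + np.random.rand(500)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Each row of shap_values attributes one prediction to the individual features,
# so every point gets its own (local) explanation.
print(shap_values[0])
```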

To make sure your explanations are reliable, we would suggest having them reviewed by someone with a good grasp of the business or, if that is not an option, comparing the results returned by several explainability techniques.

Resources on explainability: https://christophm.github.io/interpretable-ml-book/

This study was completed over 6 weeks.