Machine Learning Interpretability in R

2018-09-28 00:00:00 AI Machine Learning h2oai

In this video the presenter goes over a new R package called ‘iml’. The package is a strong toolkit for explaining models through global and local feature importance. These explanations are critical, especially in the health field and if you're subject to GDPR. Now, with the combination of Shapley values, LIME, and partial dependence plots, you can figure out how the model works and why it makes a given prediction.

I think we’ll see a lot of innovation in the ‘model interpretation’ space going forward.

Notes from the video:

iml R package
ML models have huge potential but are complex and hard to understand
In critical situations (life or death), you need to be able to explain the model's decision
Models that are interpretable today: decision trees, rules, linear regression
What's needed is a model-agnostic method
Feature importance: interpretation at the global level, for the model as a whole
Compute the model's generalization error on the dataset
Score each feature: what is the effect of that feature on the fitted model?
Fit a surrogate model
Generate Partial Dependence Plots (visualize the effect of a feature on the prediction)
For Local Interpretation, use LIME
Now available in R as the iml package (built on R6 classes)
What’s in the iml package? Permutation Feature Importance / Feature Interactions / Partial Dependence Plots / LIME / Shapley Values / Tree Surrogates
The presenter shows an example on the bike data set
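
To make the permutation feature importance and partial dependence notes above more concrete, here is a minimal sketch of both ideas. It is written in plain Python rather than with the iml API, and everything in it is an assumption made for illustration: the toy model, the random data, the function names, and the choice to score importance as the increase in mean absolute error after shuffling a feature (a ratio of errors is another common choice, but it is undefined here because the toy model has zero baseline error).

```python
import random


def mean_abs_error(model, X, y):
    """Average absolute difference between predictions and targets."""
    return sum(abs(model(row) - target) for row, target in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Importance of feature j = average increase in error after shuffling column j."""
    rng = random.Random(seed)
    baseline = mean_abs_error(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle one column to break its link with the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mean_abs_error(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def partial_dependence(model, X, j, grid):
    """For each grid value, set feature j to that value in every row and average the predictions."""
    curve = []
    for v in grid:
        preds = [model(row[:j] + [v] + row[j + 1:]) for row in X]
        curve.append(sum(preds) / len(preds))
    return curve


# Hypothetical "black box": depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2.
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]


data_rng = random.Random(42)
X = [[data_rng.uniform(0, 1) for _ in range(3)] for _ in range(200)]
y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
pd_curve = partial_dependence(toy_model, X, 0, [0.0, 0.5, 1.0])

print("permutation importance per feature:", importances)
print("partial dependence of feature 0:", pd_curve)
```

Both functions only ever call the model as a function of its inputs, which is what makes the approach model agnostic: the same code works for any fitted model that can produce predictions. In this toy setup the unused feature should get an importance of zero, and the partial dependence curve for feature 0 should rise with the slope the toy model assigns to it.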