Prepare data for SHAP force plot (stacked bar chart) — shap.prep.stack.data • SHAPforxgboost

Transforms SHAP values into a format suitable for force plots, which show how features contribute to individual predictions. The function:

Ranks features by importance
Optionally combines less important features into 'rest_variables'
Clusters observations for better visualization
Assigns group labels for faceted plots
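
The four steps above can be approximated in base R. The block below is a minimal sketch under stated assumptions, not the package's actual implementation: the data are synthetic, the names shap_demo and stack_demo are hypothetical, and ranking by mean absolute SHAP value and clustering on scaled values are assumed details.

```r
# Illustrative sketch only: base R, synthetic data, hypothetical names.
set.seed(42)

# Hypothetical SHAP values: 30 observations, 5 features on different scales
shap_demo <- as.data.frame(
  sapply(c(f1 = 0.1, f2 = 2, f3 = 0.5, f4 = 1, f5 = 0.05),
         function(s) rnorm(30, sd = s))
)

# 1. Rank features by mean absolute SHAP value, most important first
#    (assumed importance measure)
importance <- sort(colMeans(abs(shap_demo)), decreasing = TRUE)
ranked_col <- names(importance)

# 2. Keep the top_n features and sum the others into 'rest_variables'
top_n <- 2
rest_col <- ranked_col[-seq_len(top_n)]
stack_demo <- shap_demo[, ranked_col[seq_len(top_n)], drop = FALSE]
stack_demo$rest_variables <- rowSums(shap_demo[, rest_col, drop = FALSE])

# 3. Cluster observations hierarchically (scaling first is an assumption)
clusters <- hclust(dist(scale(stack_demo)), method = "ward.D")

# 4. Add an ID, cut the tree into n_groups, and attach the group labels
n_groups <- 4
stack_demo$ID <- seq_len(nrow(stack_demo))
stack_demo$group <- cutree(clusters, k = n_groups)

# Order rows by the dendrogram so similar observations sit next to each other
stack_demo <- stack_demo[clusters$order, ]
head(stack_demo)
```

Summing the less important features keeps each row's total contribution unchanged, so the stacked bars still add up to the same value per observation.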

Usage

shap.prep.stack.data(
  shap_contrib,
  top_n = NULL,
  data_percent = 1,
  cluster_method = "ward.D",
  n_groups = 10L
)

Arguments

shap_contrib: the SHAP value data returned from predict. An ID variable is added for each observation in the shap_contrib dataset for better tracking; it is created at the beginning as 1:nrow(shap_contrib). The ID matches the output from shap.prep.
top_n: integer, optional. Show only the top_n most important features and combine the rest. The default, NULL, uses all features.
data_percent: the proportion of the data to plot (to speed up test plots). The accepted input range is (0, 1]. If too few observations are left, the clustering function will raise an error.
cluster_method: the clustering method, defaults to "ward.D". See stats::hclust for details.
n_groups: an integer, the number of groups to plot in shap.plot.force_plot_bygroup. Defaults to 10L.
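
A call that combines these arguments might look like the following. It is a sketch that assumes the SHAPforxgboost package is installed and reuses the shap_values_iris dataset from the Examples section. The values top_n = 2 and data_percent = 0.5 are arbitrary choices for illustration, and plot_data_top2 is a hypothetical name.

```r
library(SHAPforxgboost)

# Keep the 2 most important features, sum the rest into 'rest_variables',
# and plot only half of the observations to speed things up
plot_data_top2 <- shap.prep.stack.data(
  shap_contrib = shap_values_iris,
  top_n = 2,
  data_percent = 0.5,
  cluster_method = "ward.D",
  n_groups = 4
)

shap.plot.force_plot(plot_data_top2)
```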

Value

A dataset formatted for stacked force plots, to be passed to shap.plot.force_plot or shap.plot.force_plot_bygroup.

Examples

# Example: SHAP force plots (stacked bar charts)
# Shows contribution of each feature to individual predictions

plot_data <- shap.prep.stack.data(shap_contrib = shap_values_iris,
                                  n_groups = 4)
#> All the features will be used.
shap.plot.force_plot(plot_data)
#> Data has N = 150 | zoom in length is 50 at location 90.

shap.plot.force_plot(plot_data, zoom_in_group = 2)
#> Data has N = 150 | zoom in at cluster 2 with N = 28.


# Plot all clusters separately
shap.plot.force_plot_bygroup(plot_data)