Table 13 Parameters used for the implementation of the ML workflow

From: Data analysis with Shapley values for automatic subject selection in Alzheimer’s disease data sets using interpretable machine learning

Method Hyperparameter Values
RF n_estimators 50
  criterion “gini”
  max_depth None
  min_weight_fraction_leaf 0.0
  max_features “auto”
  max_leaf_nodes None
  min_impurity_decrease 0.0
  min_impurity_split None
  bootstrap True
  oob_score False
  class_weight None
  ccp_alpha 0.0
  max_samples None
XGBoost subsample 0.6
  objective “binary:logistic”
  booster “gbtree”
  eta 0.3
  gamma 0
  max_depth 6
  min_child_weight 1
  max_delta_step 0
  sampling_method “uniform”
  colsample_bytree 1
  colsample_bylevel 1
  colsample_bynode 1
  lambda 1
  alpha 0
  tree_method “auto”
  sketch_eps 0.03
  scale_pos_weight 1
  updater “grow_colmaker,prune”
  refresh_leaf 1
  process_type “default”
  grow_policy “depthwise”
  max_leaves 0
  max_bin 256
  predictor “auto”
  num_parallel_tree 1
LR solver “liblinear”
  penalty “l2”
  dual False
  tol 1e-4
  C 1.0
  fit_intercept True
  intercept_scaling 1
  class_weight None
  max_iter 5000
  multi_class “auto”
  warm_start False
  l1_ratio None
Data Shapley number of repetitions 4
  model_family {“RandomForest”, “logistic”}
  metric “accuracy”
  num_test 108
  problem “classification”
  sample weights None
  save_every 100
  err 0.1
  tolerance 0.01
  g_run False
  loo_run True
Kernel SHAP nsample 3000
  l1_reg “auto”
  link “identity”
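The RF and LR rows above can be turned into estimator objects along the following lines. This is a minimal sketch, not the authors' code: the table does not name the library, and the code assumes scikit-learn because the hyperparameter names match its `RandomForestClassifier` and `LogisticRegression`. The names `RF_PARAMS`, `LR_PARAMS`, and `build` are hypothetical, and the data is synthetic rather than the study's data. Some listed parameters (for example `min_impurity_split`, and `"auto"` as a value for `max_features`) were removed in later scikit-learn releases, so the helper drops arguments the installed version does not accept and maps `"auto"` to `"sqrt"`, which is what `"auto"` meant for classifiers.

```python
import inspect
import warnings

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Values copied from the RF rows of Table 13.
RF_PARAMS = {
    "n_estimators": 50,
    "criterion": "gini",
    "max_depth": None,
    "min_weight_fraction_leaf": 0.0,
    "max_features": "auto",
    "max_leaf_nodes": None,
    "min_impurity_decrease": 0.0,
    "min_impurity_split": None,
    "bootstrap": True,
    "oob_score": False,
    "class_weight": None,
    "ccp_alpha": 0.0,
    "max_samples": None,
}

# Values copied from the LR rows of Table 13.
LR_PARAMS = {
    "solver": "liblinear",
    "penalty": "l2",
    "dual": False,
    "tol": 1e-4,
    "C": 1.0,
    "fit_intercept": True,
    "intercept_scaling": 1,
    "class_weight": None,
    "max_iter": 5000,
    "multi_class": "auto",
    "warm_start": False,
    "l1_ratio": None,
}


def build(estimator_cls, table_params):
    """Hypothetical helper: instantiate estimator_cls with the table values
    that the installed scikit-learn version still accepts."""
    accepted = inspect.signature(estimator_cls.__init__).parameters
    params = {k: v for k, v in table_params.items() if k in accepted}
    # Assumption: "auto" is replaced by "sqrt", its meaning for classifiers
    # in the releases that still accepted it.
    if params.get("max_features") == "auto":
        params["max_features"] = "sqrt"
    return estimator_cls(**params)


# Synthetic stand-in data, used only to show that the estimators fit.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

rf = build(RandomForestClassifier, RF_PARAMS)
lr = build(LogisticRegression, LR_PARAMS)

with warnings.catch_warnings():
    # Newer releases may warn about deprecated arguments such as multi_class.
    warnings.simplefilter("ignore", FutureWarning)
    rf.fit(X, y)
    lr.fit(X, y)

print("RF training accuracy:", rf.score(X, y))
print("LR training accuracy:", lr.score(X, y))
```

Filtering by constructor signature keeps the table values in one place while letting the same script run on both older and newer scikit-learn versions. For an exact reproduction, the library version used in the study would need to be pinned.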