Table 13 Parameters used for the implementation of the ML workflow

From: Data analysis with Shapley values for automatic subject selection in Alzheimer’s disease data sets using interpretable machine learning

| Method | Hyperparameter | Values |
|---|---|---|
| RF | n_estimators | 50 |
| | criterion | “gini” |
| | max_depth | None |
| | min_weight_fraction_leaf | 0.0 |
| | max_features | “auto” |
| | max_leaf_nodes | None |
| | min_impurity_decrease | 0.0 |
| | min_impurity_split | None |
| | bootstrap | True |
| | oob_score | False |
| | class_weight | None |
| | ccp_alpha | 0.0 |
| | max_samples | None |
| XGBoost | subsample | 0.6 |
| | objective | “binary:logistic” |
| | booster | “gbtree” |
| | eta | 0.3 |
| | gamma | 0 |
| | max_depth | 6 |
| | min_child_weight | 1 |
| | max_delta_step | 0 |
| | sampling_method | “uniform” |
| | colsample_bytree | 1 |
| | colsample_bylevel | 1 |
| | colsample_bynode | 1 |
| | lambda | 1 |
| | alpha | 0 |
| | tree_method | “auto” |
| | sketch_eps | 0.03 |
| | scale_pos_weight | 1 |
| | updater | “grow_colmaker,prune” |
| | refresh_leaf | 1 |
| | process_type | “default” |
| | grow_policy | “depthwise” |
| | max_leaves | 0 |
| | max_bin | 256 |
| | predictor | “auto” |
| | num_parallel_tree | 1 |
| LR | solver | “liblinear” |
| | penalty | “l2” |
| | dual | False |
| | tol | 1e-4 |
| | C | 1.0 |
| | fit_intercept | True |
| | intercept_scaling | 1 |
| | class_weight | None |
| | max_iter | 5000 |
| | multi_class | “auto” |
| | warm_start | False |
| | l1_ratio | None |
| Data Shapley | number of repetitions | 4 |
| | model_family | {“RandomForest”, “logistic”} |
| | metric | “accuracy” |
| | num_test | 108 |
| | problem | “classification” |
| | sample weights | None |
| | save_every | 100 |
| | err | 0.1 |
| | tolerance | 0.01 |
| | g_run | False |
| | loo_run | True |
| Kernel SHAP | nsample | 3000 |
| | l1_reg | “auto” |
| | link | “identity” |
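
The RF and LR rows of the table could be turned into estimator objects roughly as sketched below. This is a minimal illustration, not the authors' code: it assumes the hyperparameter names refer to scikit-learn's `RandomForestClassifier` and `LogisticRegression`, which the table itself does not state. The helper `supported_kwargs` and the dictionary names are hypothetical. Some listed parameters are version dependent (for example, `min_impurity_split` and `max_features="auto"` are no longer accepted by recent scikit-learn releases), so the sketch drops any argument the installed version does not recognize and maps `"auto"` to `"sqrt"`, which is what `"auto"` meant for classifiers.

```python
import inspect

# Values transcribed from the RF rows of the table (straight quotes substituted).
RF_PARAMS = {
    "n_estimators": 50,
    "criterion": "gini",
    "max_depth": None,
    "min_weight_fraction_leaf": 0.0,
    "max_features": "auto",
    "max_leaf_nodes": None,
    "min_impurity_decrease": 0.0,
    "min_impurity_split": None,
    "bootstrap": True,
    "oob_score": False,
    "class_weight": None,
    "ccp_alpha": 0.0,
    "max_samples": None,
}

# Values transcribed from the LR rows of the table.
LR_PARAMS = {
    "solver": "liblinear",
    "penalty": "l2",
    "dual": False,
    "tol": 1e-4,
    "C": 1.0,
    "fit_intercept": True,
    "intercept_scaling": 1,
    "class_weight": None,
    "max_iter": 5000,
    "multi_class": "auto",
    "warm_start": False,
    "l1_ratio": None,
}


def supported_kwargs(estimator_cls, params):
    """Hypothetical helper: keep only arguments the constructor accepts."""
    accepted = inspect.signature(estimator_cls).parameters
    return {name: value for name, value in params.items() if name in accepted}


try:
    # Assumption: the table refers to these scikit-learn estimators.
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
except ImportError:
    # scikit-learn is not installed; the dictionaries above remain usable.
    RandomForestClassifier = LogisticRegression = None

if RandomForestClassifier is not None:
    rf_kwargs = supported_kwargs(RandomForestClassifier, RF_PARAMS)
    # "auto" meant sqrt(n_features) for classifiers; "sqrt" is the explicit form.
    if rf_kwargs.get("max_features") == "auto":
        rf_kwargs["max_features"] = "sqrt"
    rf = RandomForestClassifier(**rf_kwargs)
    lr = LogisticRegression(**supported_kwargs(LogisticRegression, LR_PARAMS))
```

Filtering by constructor signature keeps the dictionaries a faithful copy of the table while still letting the sketch run on library versions that have since removed some arguments. The XGBoost rows are left out here because that library's scikit-learn wrapper also accepts extra keyword arguments, so a signature check would not be a reliable filter.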