matbench_v0.1: AMMExpress v2020

Algorithm description:

Automatminer express v1.0.3.20200727. The pipeline combines automatic featurization, tree-based feature reduction, and genetic-algorithm-based AutoML using the TPOT package.

All data were generated with the same configuration (the default "express" preset). The automatminer version requirement in turn pins the versions of many dependent packages, such as matminer, which must be installed in your virtualenv for the algorithm to work.

The raw data download and an example notebook are available on the matbench repo.

References (in bibtex format):

@article{Dunn2020,
  doi = {10.1038/s41524-020-00406-3},
  url = {https://doi.org/10.1038/s41524-020-00406-3},
  year = {2020},
  month = sep,
  publisher = {Springer Science and Business Media {LLC}},
  volume = {6},
  number = {1},
  author = {Alexander Dunn and Qi Wang and Alex Ganose and Daniel Dopp and Anubhav Jain},
  title = {Benchmarking materials property prediction methods: the Matbench test set and Automatminer reference algorithm},
  journal = {npj Computational Materials}
}

User metadata:

{'autofeaturizer_kwargs': {'n_jobs': 10, 'preset': 'express'},
 'cleaner_kwargs': {'feature_na_method': 'drop',
                    'max_na_frac': 0.1,
                    'na_method_fit': 'mean',
                    'na_method_transform': 'mean'},
 'learner_kwargs': {'max_eval_time_mins': 20,
                    'max_time_mins': 1440,
                    'memory': 'auto',
                    'n_jobs': 10,
                    'population_size': 200},
 'learner_name': 'TPOTAdaptor',
 'reducer_kwargs': {'reducers': ['corr', 'tree'],
                    'tree_importance_percentile': 0.99}}
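
To read the time limits in the learner settings above, it can help to convert them to hours. The sketch below assumes that `max_time_mins` caps the whole TPOT search for one fit (and `max_eval_time_mins` a single pipeline evaluation), as with TPOT's own parameters of the same names. It also assumes one search per fold, with 13 tasks and five folds per task as reported on this page. The helper name `search_budget_hours` is hypothetical, and the result is only an upper bound on search time, not a measured runtime; featurization time is not included.

```python
# Learner settings copied from the user metadata above.
learner_kwargs = {
    "max_eval_time_mins": 20,
    "max_time_mins": 1440,
    "memory": "auto",
    "n_jobs": 10,
    "population_size": 200,
}

N_TASKS = 13  # "Tasks recorded: 13 of 13 total"
N_FOLDS = 5   # assumption: fold_0 .. fold_4 for every task


def search_budget_hours(kwargs, n_fits=1):
    """Upper bound on AutoML search time in hours, assuming one search per fit."""
    return kwargs["max_time_mins"] / 60 * n_fits


per_fit = search_budget_hours(learner_kwargs)
total = search_budget_hours(learner_kwargs, N_TASKS * N_FOLDS)
print(f"per fit: {per_fit:.0f} h, whole benchmark: at most {total:.0f} h")
```

A search may stop well before its limit, so the actual time spent can be considerably lower.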

Metadata:

Tasks recorded: 13 of 13 total

Benchmark is complete? True

Software Requirements

{'python': ['automatminer==1.0.3.20200727', 'matbench==0.1.0']}
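
The requirements dictionary above maps directly onto pinned pip requirement specifiers. A minimal sketch, assuming the `python` key lists pip-installable packages (the helper name `pip_command` is hypothetical and not part of matbench):

```python
import shlex

# Copied from the "Software Requirements" entry above.
requirements = {"python": ["automatminer==1.0.3.20200727", "matbench==0.1.0"]}


def pip_command(reqs):
    """Build a pip install command line from the 'python' requirement list."""
    return "pip install " + " ".join(shlex.quote(r) for r in reqs["python"])


print(pip_command(requirements))
```

Running the printed command inside a fresh virtualenv is one way to keep the pinned dependencies from clashing with other installed packages.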

Task data:

matbench_dielectric

Fold scores
fold mae rmse mape* max_error
fold_0 0.2188 0.6855 0.0760 14.6654
fold_1 0.2844 1.0764 0.0899 19.6283
fold_2 0.4257 2.9472 0.0889 59.0112
fold_3 0.3198 2.2782 0.0720 53.5196
fold_4 0.3264 1.6137 0.0987 28.1601
Fold score stats
metric mean max min std
mae 0.3150 0.4257 0.2188 0.0672
rmse 1.7202 2.9472 0.6855 0.8140
mape* 0.0851 0.0987 0.0720 0.0098
max_error 34.9969 59.0112 14.6654 17.9782
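
The fold score stats can be checked against the per-fold scores. The sketch below recomputes the `mae` row from the rounded fold values shown above, assuming the `std` column is the population standard deviation (dividing by the number of folds). That choice reproduces 0.0672, whereas the sample standard deviation would give roughly 0.075. The helper name `fold_stats` is hypothetical, and agreement is only to the displayed precision because the inputs here are already rounded.

```python
import statistics

# MAE per fold for matbench_dielectric, copied from the table above.
fold_mae = [0.2188, 0.2844, 0.4257, 0.3198, 0.3264]


def fold_stats(scores):
    """Summarise per-fold scores; pstdev is the population standard deviation."""
    return {
        "mean": statistics.fmean(scores),
        "max": max(scores),
        "min": min(scores),
        "std": statistics.pstdev(scores),
    }


stats = fold_stats(fold_mae)
print({name: round(value, 4) for name, value in stats.items()})
```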
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.006, score_func=<function f_regression at 0x2aaaef1a0840>))', '(robustscaler, RobustScaler(copy=true, quantile_range=(25.0, 75.0), with_centering=true,\n with_scaling=true))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.95, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=huber, max_depth=5,\n max_features=0.45, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=19, min_samples_split=14,\n min_weight_fraction_leaf=0.0, n_estimators=1000,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.9000000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData range AtomicWeight', 'MagpieData avg_dev AtomicWeight', 'MagpieData minimum MeltingT', 'MagpieData mean MeltingT', 'MagpieData mode MeltingT', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData maximum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData avg_dev NsValence', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NValence', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData range NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData mode GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'maximum oxidation state', 'std_dev oxidation state', 'avg ionic char', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 
'ewald_energy', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'sine coulomb matrix eig 6', 'sine coulomb matrix eig 7', 'sine coulomb matrix eig 19', 'structural complexity per cell', 'crystal_system_tetragonal']}
fold_1 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.0001))', '(zerocount, ZeroCount())', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.75, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=huber, max_depth=3,\n max_features=0.2, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=4, min_samples_split=5,\n min_weight_fraction_leaf=0.0, n_estimators=1000,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=1.0, tol=0.0001,\n validation_fraction=0.1, verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev AtomicWeight', 'MagpieData minimum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData avg_dev Column', 'MagpieData mean CovalentRadius', 'MagpieData maximum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NdValence', 'MagpieData range NValence', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData range NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData mode GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData mode GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'range oxidation state', 'avg anion electron affinity', 'avg ionic char', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 4', 'sine coulomb matrix eig 7', 'structural complexity per atom']}
fold_2 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.001))', '(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.8, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=huber, max_depth=9,\n max_features=0.55, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=19, min_samples_split=11,\n min_weight_fraction_leaf=0.0, n_estimators=200,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.9000000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData range AtomicWeight', 'MagpieData avg_dev AtomicWeight', 'MagpieData minimum MeltingT', 'MagpieData mean MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean Row', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData maximum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NValence', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData range NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData mode GSmagmom', 'MagpieData minimum SpaceGroupNumber', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'max ionic char', 'avg ionic char', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 
'sine coulomb matrix eig 4', 'sine coulomb matrix eig 8', 'sine coulomb matrix eig 19', 'structural complexity per atom']}
fold_3 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.023, score_func=<function f_regression at 0x2aaaef19f950>))', '(standardscaler, StandardScaler(copy=true, with_mean=true, with_std=true))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.75, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=huber, max_depth=7,\n max_features=0.7500000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=16, min_samples_split=8,\n min_weight_fraction_leaf=0.0, n_estimators=200,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.6500000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData range AtomicWeight', 'MagpieData avg_dev AtomicWeight', 'MagpieData minimum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData avg_dev Column', 'MagpieData minimum CovalentRadius', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData maximum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData maximum NdValence', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NsUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData range NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mode NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'maximum oxidation state', 'avg anion electron affinity', 'avg ionic char', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'ewald_energy', 
'sine coulomb matrix eig 0', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'sine coulomb matrix eig 7', 'structural complexity per atom', 'structural complexity per cell', 'crystal_system_tetragonal']}
fold_4 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.034, score_func=<function f_regression at 0x2aaaf35a08c8>))', '(zerocount, ZeroCount())', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.85, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=huber, max_depth=9,\n max_features=0.7500000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=13, min_samples_split=17,\n min_weight_fraction_leaf=0.0, n_estimators=100,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.7500000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData range AtomicWeight', 'MagpieData avg_dev AtomicWeight', 'MagpieData minimum MeltingT', 'MagpieData mean MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean Row', 'MagpieData mean CovalentRadius', 'MagpieData maximum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData range NValence', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData range NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData mode GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'MagpieData mode SpaceGroupNumber', 'maximum oxidation state', 'avg ionic char', 'density', 'packing fraction', 'ewald_energy', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 7']}

matbench_expt_gap

Fold scores
fold mae rmse mape* max_error
fold_0 0.3998 0.9435 0.3372 8.0111
fold_1 0.4061 0.9354 0.3085 8.6887
fold_2 0.4538 1.0955 0.3916 12.7533
fold_3 0.4061 1.0273 0.3019 12.6296
fold_4 0.4150 0.9573 0.4503 6.0779
Fold score stats
metric mean max min std
mae 0.4161 0.4538 0.3998 0.0194
rmse 0.9918 1.0955 0.9354 0.0612
mape* 0.3579 0.4503 0.3019 0.0560
max_error 9.6321 12.7533 6.0779 2.6411
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.035, score_func=<function f_regression at 0x2aaaf35a18c8>))', '(standardscaler, StandardScaler(copy=true, with_mean=true, with_std=true))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.75, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=ls, max_depth=9,\n max_features=0.35000000000000003, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=5,\n min_weight_fraction_leaf=0.0, n_estimators=100,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.9000000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData range Number', 'MagpieData avg_dev Number', 'MagpieData range MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData range NdValence', 'MagpieData mean NdValence', 'MagpieData mean NfValence', 'MagpieData minimum NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mode GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 
'MagpieData avg_dev GSmagmom', 'MagpieData range SpaceGroupNumber', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'max ionic char', 'avg ionic char']}
fold_1 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.046, score_func=<function f_regression at 0x2aaaef19f8c8>))', '(onehotencoder, OneHotEncoder(categorical_features=[false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, true, false, false, false, false, ...],\n dtype=<class float>, minimum_fraction=0.1, sparse=false,\n threshold=10))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.85, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=huber, max_depth=9,\n max_features=0.8, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=4, min_samples_split=5,\n min_weight_fraction_leaf=0.0, n_estimators=100,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=1.0, tol=0.0001,\n validation_fraction=0.1, verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData range Number', 'MagpieData avg_dev Number', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData maximum NdValence', 'MagpieData range NdValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData maximum NValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NsUnfilled', 'MagpieData 
maximum NpUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'max ionic char', 'avg ionic char']}
fold_2 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.0005))', '(robustscaler, RobustScaler(copy=true, quantile_range=(25.0, 75.0), with_centering=true,\n with_scaling=true))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.5500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData range Number', 'MagpieData avg_dev Number', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData avg_dev NsValence', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData range NdValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NfValence', 'MagpieData minimum NValence', 'MagpieData maximum NValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData avg_dev NsUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean 
GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'max ionic char', 'avg ionic char']}
fold_3 {'best_pipeline': ['(selectpercentile, SelectPercentile(percentile=85,\n score_func=<function f_regression at 0x2aaaf39a38c8>))', '(onehotencoder, OneHotEncoder(categorical_features=[false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false, ...],\n dtype=<class float>, minimum_fraction=0.2, sparse=false,\n threshold=10))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.8, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=huber, max_depth=7,\n max_features=0.9500000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=4, min_samples_split=20,\n min_weight_fraction_leaf=0.0, n_estimators=200,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.8, tol=0.0001,\n validation_fraction=0.1, verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData range Number', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData avg_dev AtomicWeight', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData range NdValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData mean NUnfilled', 
'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'max ionic char', 'avg ionic char']}
fold_4 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.0005))', '(maxabsscaler, MaxAbsScaler(copy=true))', '(randomforestregressor, RandomForestRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.35000000000000003, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=5,\n min_weight_fraction_leaf=0.0, n_estimators=100,\n n_jobs=null, oob_score=false, random_state=null,\n verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData range Number', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData avg_dev AtomicWeight', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData range NdValence', 'MagpieData mean NdValence', 'MagpieData minimum NValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mode GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber']}

matbench_expt_is_metal

Fold scores
fold accuracy balanced_accuracy f1 rocauc
fold_0 0.9218 0.9218 0.9205 0.9218
fold_1 0.9157 0.9156 0.9145 0.9156
fold_2 0.9207 0.9207 0.9193 0.9207
fold_3 0.9228 0.9228 0.9223 0.9228
fold_4 0.9238 0.9238 0.9235 0.9238
Fold score stats
metric mean max min std
accuracy 0.9210 0.9238 0.9157 0.0028
balanced_accuracy 0.9209 0.9238 0.9156 0.0028
f1 0.9200 0.9235 0.9145 0.0031
rocauc 0.9209 0.9238 0.9156 0.0028
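
In the tables above, `rocauc` equals `balanced_accuracy` in every fold. That is what one would expect if ROC AUC were computed from hard class labels rather than predicted probabilities, although this page does not say how it was computed. A minimal sketch of the identity, using a hypothetical confusion matrix (the counts are illustrative and not taken from the benchmark):

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of the true positive rate and the true negative rate."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def roc_auc_from_labels(tp, fn, tn, fp):
    """Area under the ROC curve when the classifier emits only hard labels.

    The curve then has a single interior point (FPR, TPR), joined by straight
    lines to (0, 0) and (1, 1); the area is computed with the trapezoid rule.
    """
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    points = [(0.0, 0.0), (fpr, tpr), (1.0, 1.0)]
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical counts for illustration only.
counts = dict(tp=450, fn=50, tn=470, fp=30)
bal_acc = balanced_accuracy(**counts)
auc = roc_auc_from_labels(**counts)
print(round(bal_acc, 4), round(auc, 4))
```

With a single operating point the trapezoid area reduces to (TPR + TNR) / 2, which is exactly the balanced accuracy.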
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.009000000000000001,\n score_func=<function f_classif at 0x2aaaf35a16a8>))', '(onehotencoder, OneHotEncoder(categorical_features=[false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false, ...],\n dtype=<class float>, minimum_fraction=0.2, sparse=false,\n threshold=10))', '(gradientboostingclassifier, GradientBoostingClassifier(criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=deviance, max_depth=9,\n max_features=0.25000000000000006,\n max_leaf_nodes=null, min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=13,\n min_samples_split=5, min_weight_fraction_leaf=0.0,\n n_estimators=500, n_iter_no_change=null,\n presort=auto, random_state=null,\n subsample=0.7500000000000002, tol=0.0001,\n validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData mean AtomicWeight', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev Row', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev 
NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData range GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'max ionic char', 'avg ionic char']}
fold_1 {'best_pipeline': ['(rfe, RFE(estimator=ExtraTreesClassifier(bootstrap=false, class_weight=null,\n criterion=gini, max_depth=null,\n max_features=0.15000000000000002,\n max_leaf_nodes=null,\n min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=1,\n min_samples_split=2,\n min_weight_fraction_leaf=0.0,\n n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null,\n verbose=0, warm_start=false),\n n_features_to_select=null, step=0.35000000000000003, verbose=0))', '(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(gradientboostingclassifier, GradientBoostingClassifier(criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=deviance, max_depth=5,\n max_features=0.8500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=17,\n min_weight_fraction_leaf=0.0, n_estimators=1000,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.45000000000000007,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData range AtomicWeight', 'MagpieData avg_dev AtomicWeight', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev Row', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData maximum NpValence', 'MagpieData mean NpValence', 'MagpieData avg_dev 
NpValence', 'MagpieData mean NdValence', 'MagpieData maximum NValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'max ionic char', 'avg ionic char']}
fold_2 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.001))', '(maxabsscaler, MaxAbsScaler(copy=true))', '(gradientboostingclassifier, GradientBoostingClassifier(criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=deviance, max_depth=9,\n max_features=0.6500000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=10, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=100,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.9500000000000002,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData range Number', 'MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev Row', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData maximum NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData 
avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'max ionic char', 'avg ionic char']}
fold_3 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.03, score_func=<function f_classif at 0x2aaaf35a0730>))', '(maxabsscaler, MaxAbsScaler(copy=true))', '(gradientboostingclassifier, GradientBoostingClassifier(criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=deviance, max_depth=7,\n max_features=0.05, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=16, min_samples_split=17,\n min_weight_fraction_leaf=0.0, n_estimators=500,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.8500000000000002,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData range Number', 'MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData maximum Column', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev Row', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean SpaceGroupNumber', 'max 
ionic char', 'avg ionic char']}
fold_4 {'best_pipeline': ['(rfe, RFE(estimator=ExtraTreesClassifier(bootstrap=false, class_weight=null,\n criterion=entropy, max_depth=null,\n max_features=0.5500000000000002,\n max_leaf_nodes=null,\n min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=1,\n min_samples_split=2,\n min_weight_fraction_leaf=0.0,\n n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null,\n verbose=0, warm_start=false),\n n_features_to_select=null, step=0.15000000000000002, verbose=0))', '(robustscaler, RobustScaler(copy=true, quantile_range=(25.0, 75.0), with_centering=true,\n with_scaling=true))', '(gradientboostingclassifier, GradientBoostingClassifier(criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=deviance, max_depth=9,\n max_features=0.5500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=7, min_samples_split=8,\n min_weight_fraction_leaf=0.0, n_estimators=100,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.8500000000000002,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData range Number', 'MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData maximum Column', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev Row', 'MagpieData maximum CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData maximum NpValence', 'MagpieData mean 
NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData maximum NValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData range GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'max ionic char', 'avg ionic char']}

matbench_glass

Fold scores
fold accuracy balanced_accuracy f1 rocauc
fold_0 0.8283 0.8441 0.8697 0.8441
fold_1 0.8125 0.8383 0.8548 0.8383
fold_2 0.8574 0.8546 0.8956 0.8546
fold_3 0.9173 0.8742 0.9437 0.8742
fold_4 0.9375 0.8921 0.9579 0.8921
Fold score stats
metric mean max min std
accuracy 0.8706 0.9375 0.8125 0.0490
balanced_accuracy 0.8607 0.8921 0.8383 0.0199
f1 0.9043 0.9579 0.8548 0.0404
rocauc 0.8607 0.8921 0.8383 0.0199
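
Note on the identical balanced_accuracy and rocauc columns above: one plausible explanation, not confirmed by this page, is that rocauc was computed from hard 0/1 class predictions rather than from predicted probabilities. In that case ROC AUC reduces to (TPR + TNR) / 2, which is the balanced accuracy. A minimal sketch of that identity, using made-up labels (y_true and y_pred below are hypothetical, not matbench data) and hand-written helper functions:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of the true-positive rate and the true-negative rate."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    tpr = sum(p == 1 for p in pos) / len(pos)
    tnr = sum(p == 0 for p in neg) / len(neg)
    return (tpr + tnr) / 2


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and hard predictions, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1, 0, 1]

bal_acc = balanced_accuracy(y_true, y_pred)
auc = roc_auc(y_true, y_pred)  # hard labels used as the ranking scores
print(f"balanced_accuracy={bal_acc:.4f} rocauc={auc:.4f}")
```

With probability scores instead of hard labels, the two metrics would generally differ.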
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(selectfrommodel, SelectFromModel(estimator=ExtraTreesClassifier(bootstrap=false,\n class_weight=null,\n criterion=entropy,\n max_depth=null,\n max_features=0.35000000000000003,\n max_leaf_nodes=null,\n min_impurity_decrease=0.0,\n min_impurity_split=null,\n min_samples_leaf=1,\n min_samples_split=2,\n min_weight_fraction_leaf=0.0,\n n_estimators=100, n_jobs=null,\n oob_score=false,\n random_state=null, verbose=0,\n warm_start=false),\n max_features=null, norm_order=1, prefit=false, threshold=0.0))', '(maxabsscaler, MaxAbsScaler(copy=true))', '(gradientboostingclassifier, GradientBoostingClassifier(criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=deviance, max_depth=7,\n max_features=0.05, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=4, min_samples_split=11,\n min_weight_fraction_leaf=0.0, n_estimators=500,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.9500000000000002,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData avg_dev AtomicWeight', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev Row', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NsValence', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NsUnfilled', 'MagpieData mean NpUnfilled', 
'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData avg_dev GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'Yang omega', 'Yang delta', 'Miedema_deltaH_inter', 'Miedema_deltaH_amor', 'Miedema_deltaH_ss_min']}
fold_1 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.0001))', '(standardscaler, StandardScaler(copy=true, with_mean=true, with_std=true))', '(extratreesclassifier, ExtraTreesClassifier(bootstrap=false, class_weight=null, criterion=gini,\n max_depth=null, max_features=0.35000000000000003,\n max_leaf_nodes=null, min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=1,\n min_samples_split=2, min_weight_fraction_leaf=0.0,\n n_estimators=20, n_jobs=null, oob_score=false,\n random_state=null, verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mean AtomicWeight', 'MagpieData avg_dev AtomicWeight', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev Row', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NsUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'Yang omega', 'Yang delta', 'Miedema_deltaH_inter', 'Miedema_deltaH_amor', 'Miedema_deltaH_ss_min']}
fold_2 {'best_pipeline': ['(selectpercentile, SelectPercentile(percentile=74,\n score_func=<function f_classif at 0x2aaaf35a0730>))', '(onehotencoder, OneHotEncoder(categorical_features=[false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false, ...],\n dtype=<class float>, minimum_fraction=0.2, sparse=false,\n threshold=10))', '(extratreesclassifier, ExtraTreesClassifier(bootstrap=false, class_weight=null, criterion=entropy,\n max_depth=null, max_features=0.6500000000000001,\n max_leaf_nodes=null, min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=1,\n min_samples_split=2, min_weight_fraction_leaf=0.0,\n n_estimators=100, n_jobs=null, oob_score=false,\n random_state=null, verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData range Number', 'MagpieData mean Number', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData avg_dev AtomicWeight', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev Row', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NsUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 
'MagpieData maximum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData avg_dev GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'Yang omega', 'Yang delta', 'Miedema_deltaH_inter', 'Miedema_deltaH_amor', 'Miedema_deltaH_ss_min']}
fold_3 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.0001))', '(standardscaler, StandardScaler(copy=true, with_mean=true, with_std=true))', '(gradientboostingclassifier, GradientBoostingClassifier(criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=deviance, max_depth=9,\n max_features=0.15000000000000002,\n max_leaf_nodes=null, min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=13,\n min_samples_split=17, min_weight_fraction_leaf=0.0,\n n_estimators=1000, n_iter_no_change=null,\n presort=auto, random_state=null,\n subsample=0.5500000000000002, tol=0.0001,\n validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData mean Number', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData maximum AtomicWeight', 'MagpieData range AtomicWeight', 'MagpieData avg_dev AtomicWeight', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev Row', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NsUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean GSmagmom', 'MagpieData 
avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'Yang omega', 'Yang delta', 'Miedema_deltaH_inter', 'Miedema_deltaH_amor', 'Miedema_deltaH_ss_min']}
fold_4 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.0001))', '(standardscaler, StandardScaler(copy=true, with_mean=true, with_std=true))', '(gradientboostingclassifier, GradientBoostingClassifier(criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=deviance, max_depth=7,\n max_features=0.45000000000000007,\n max_leaf_nodes=null, min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=16,\n min_samples_split=5, min_weight_fraction_leaf=0.0,\n n_estimators=500, n_iter_no_change=null,\n presort=auto, random_state=null,\n subsample=0.7500000000000002, tol=0.0001,\n validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData mean Number', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData maximum AtomicWeight', 'MagpieData range AtomicWeight', 'MagpieData avg_dev AtomicWeight', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev Row', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData maximum NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean 
SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'Yang omega', 'Yang delta', 'Miedema_deltaH_inter', 'Miedema_deltaH_amor', 'Miedema_deltaH_ss_min']}
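
The features_reduced lists above differ from fold to fold. A short sketch of how the overlap could be checked is given here. The folds dictionary is an abbreviated excerpt that tracks only five feature names from the matbench_glass lists above; it is an illustrative structure, not part of the matbench output format.

```python
from collections import Counter

# Abbreviated excerpt: for each matbench_glass fold, which of five tracked
# feature names appear in its 'features_reduced' list above.
folds = {
    "fold_0": {"MagpieData mean NsValence", "MagpieData avg_dev GSbandgap",
               "Yang omega", "Yang delta"},
    "fold_1": {"Yang omega", "Yang delta"},
    "fold_2": {"MagpieData mean Number", "MagpieData avg_dev GSbandgap",
               "Yang omega", "Yang delta"},
    "fold_3": {"MagpieData mean Number", "Yang omega", "Yang delta"},
    "fold_4": {"MagpieData mean Number", "Yang omega", "Yang delta"},
}

# Features retained by the reducer in every fold.
stable = set.intersection(*folds.values())

# How many folds retained each tracked feature.
counts = Counter(name for feats in folds.values() for name in feats)

print(sorted(stable))
for name, n in sorted(counts.items()):
    print(f"{name}: {n}/5 folds")
```

Applying the same set operations to the full lists would show which features the correlation and tree reducers keep consistently and which depend on the training split.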

matbench_jdft2d

Fold scores
fold mae rmse mape* max_error
fold_0 29.5070 57.7719 18.9726 362.2752
fold_1 44.3036 98.1137 0.3191 551.7742
fold_2 54.4690 164.0162 0.5117 847.0618
fold_3 28.0759 55.8345 0.2371 316.2185
fold_4 42.8931 156.9938 0.5429 1552.9102
Fold score stats
metric mean max min std
mae 39.8497 54.4690 28.0759 9.8835
rmse 106.5460 164.0162 55.8345 46.6251
mape* 4.1167 18.9726 0.2371 7.4289
max_error 726.0480 1552.9102 316.2185 453.6535
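
The fold score stats can be reproduced from the per-fold values. The sketch below recomputes the mae row for matbench_jdft2d from the rounded fold scores listed above. The std column appears to be a population standard deviation (divisor n, not n - 1). That is inferred from the numbers, not stated on this page.

```python
import statistics

# Per-fold MAE values for matbench_jdft2d, copied from the fold scores above.
mae = [29.5070, 44.3036, 54.4690, 28.0759, 42.8931]

mae_mean = statistics.fmean(mae)
mae_max = max(mae)
mae_min = min(mae)
# Population standard deviation (divides by n); assumed from the reported value.
mae_std = statistics.pstdev(mae)

print(f"mae {mae_mean:.4f} {mae_max:.4f} {mae_min:.4f} {mae_std:.4f}")
```

Because the inputs are already rounded to four decimals, recomputed stats for other tasks may differ from the reported ones in the last digit.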
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.1))', '(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.8, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=lad, max_depth=3,\n max_features=0.7000000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=500,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.7000000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData avg_dev Number', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData minimum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean Column', 'MagpieData mean Row', 'MagpieData maximum CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData mean NsValence', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NValence', 'MagpieData mean NpUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData mode NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mode GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mode GSmagmom', 'MagpieData minimum SpaceGroupNumber', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'MagpieData mode SpaceGroupNumber', 'avg anion electron affinity', 'density', 'vpa', 'packing fraction', 'crystal_system_int', 'ewald_energy', 'sine coulomb 
matrix eig 0', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'structural complexity per cell']}
fold_1 {'best_pipeline': ['(selectpercentile, SelectPercentile(percentile=40,\n score_func=<function f_regression at 0x2aaaf35a08c8>))', '(maxabsscaler, MaxAbsScaler(copy=true))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.75, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=lad, max_depth=7,\n max_features=0.15000000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=5,\n min_weight_fraction_leaf=0.0, n_estimators=500,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.9500000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData avg_dev Number', 'MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData maximum AtomicWeight', 'MagpieData minimum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData avg_dev Row', 'MagpieData maximum CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData maximum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData mean NsValence', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData maximum NValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData mode NdUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData mode NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mode GSmagmom', 'MagpieData minimum SpaceGroupNumber', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev 
SpaceGroupNumber', 'MagpieData mode SpaceGroupNumber', 'maximum oxidation state', 'std_dev oxidation state', 'avg anion electron affinity', 'density', 'vpa', 'packing fraction', 'ewald_energy', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'structural complexity per cell']}
fold_2 {'best_pipeline': ['(selectpercentile, SelectPercentile(percentile=62,\n score_func=<function f_regression at 0x2aaaf35a08c8>))', '(onehotencoder, OneHotEncoder(categorical_features=[false, false, false, false, false, false,\n false, false, false, false, false, false,\n true, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, true, false, false, false],\n dtype=<class float>, minimum_fraction=0.1, sparse=false,\n threshold=10))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.95, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=lad, max_depth=3,\n max_features=0.4, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=4, min_samples_split=8,\n min_weight_fraction_leaf=0.0, n_estimators=1000,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.7000000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData avg_dev Number', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData minimum MeltingT', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Row', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData maximum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData mean NsValence', 'MagpieData mean NpValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData range NpUnfilled', 'MagpieData mode NpUnfilled', 'MagpieData mean NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mode 
GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData minimum SpaceGroupNumber', 'MagpieData maximum SpaceGroupNumber', 'MagpieData range SpaceGroupNumber', 'MagpieData mean SpaceGroupNumber', 'std_dev oxidation state', 'avg anion electron affinity', 'density', 'vpa', 'packing fraction', 'crystal_system_int', 'is_centrosymmetric', 'ewald_energy', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'structural complexity per atom', 'structural complexity per cell']}
fold_3 {'best_pipeline': ['(selectpercentile, SelectPercentile(percentile=82,\n score_func=<function f_regression at 0x2aab561f6620>))', '(robustscaler, RobustScaler(copy=true, quantile_range=(25.0, 75.0), with_centering=true,\n with_scaling=true))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.85, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=huber, max_depth=5,\n max_features=0.4, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=4, min_samples_split=8,\n min_weight_fraction_leaf=0.0, n_estimators=200,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=1.0, tol=0.0001,\n validation_fraction=0.1, verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData maximum Number', 'MagpieData avg_dev Number', 'MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData maximum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NfValence', 'MagpieData maximum NValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NdUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData mode NUnfilled', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mode GSbandgap', 'MagpieData minimum SpaceGroupNumber', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 
'MagpieData mode SpaceGroupNumber', 'std_dev oxidation state', 'avg anion electron affinity', 'avg ionic char', 'density', 'vpa', 'ewald_energy', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'structural complexity per atom', 'structural complexity per cell']}
fold_4 {'best_pipeline': ['(selectpercentile, SelectPercentile(percentile=62,\n score_func=<function f_regression at 0x2aaaf35a08c8>))', '(zerocount, ZeroCount())', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.9, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=lad, max_depth=5,\n max_features=0.55, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=4, min_samples_split=5,\n min_weight_fraction_leaf=0.0, n_estimators=1000,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.8500000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData avg_dev Number', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData minimum AtomicWeight', 'MagpieData minimum MeltingT', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData mode MeltingT', 'MagpieData range Column', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData range Row', 'MagpieData mean Row', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData maximum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData mean NsValence', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NdUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData mode NUnfilled', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mode GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mode GSmagmom', 'MagpieData minimum SpaceGroupNumber', 'MagpieData mean SpaceGroupNumber', 
'MagpieData avg_dev SpaceGroupNumber', 'avg anion electron affinity', 'vpa', 'packing fraction', 'ewald_energy', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'structural complexity per atom', 'structural complexity per cell', 'crystal_system_tetragonal']}

matbench_log_gvrh

Fold scores
fold mae rmse mape* max_error
fold_0 0.0891 0.1270 0.0692 1.1580
fold_1 0.0852 0.1261 0.0666 1.0887
fold_2 0.0849 0.1261 0.0668 0.9631
fold_3 0.0884 0.1279 0.0670 0.8959
fold_4 0.0894 0.1313 0.0690 0.9810
Fold score stats
metric mean max min std
mae 0.0874 0.0894 0.0849 0.0020
rmse 0.1277 0.1313 0.1261 0.0019
mape* 0.0677 0.0692 0.0666 0.0012
max_error 1.0173 1.1580 0.8959 0.0937
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.2))', '(zerocount, ZeroCount())', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.99, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=ls, max_depth=7,\n max_features=0.4, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=500,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.6500000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData mean AtomicWeight', 'MagpieData maximum MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData mean NsValence', 'MagpieData mean NpValence', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3']}
fold_1 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.01))', '(robustscaler, RobustScaler(copy=true, quantile_range=(25.0, 75.0), with_centering=true,\n with_scaling=true))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.75, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=huber, max_depth=5,\n max_features=1.0, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=19, min_samples_split=8,\n min_weight_fraction_leaf=0.0, n_estimators=1000,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.55, tol=0.0001,\n validation_fraction=0.1, verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData mean NsValence', 'MagpieData mean NpValence', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 3']}
fold_2 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.01, score_func=<function f_regression at 0x2aaaef19e8c8>))', '(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(randomforestregressor, RandomForestRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.25000000000000006, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=5,\n min_weight_fraction_leaf=0.0, n_estimators=100,\n n_jobs=null, oob_score=false, random_state=null,\n verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData mean NdValence', 'MagpieData mean NsUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 3']}
fold_3 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.0001))', '(standardscaler, StandardScaler(copy=true, with_mean=true, with_std=true))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.75, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=ls, max_depth=9,\n max_features=0.1, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=19, min_samples_split=14,\n min_weight_fraction_leaf=0.0, n_estimators=1000,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.8, tol=0.0001,\n validation_fraction=0.1, verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData mean AtomicWeight', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData mean NsValence', 'MagpieData mean NpValence', 'MagpieData mean NsUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 3']}
fold_4 {'best_pipeline': ['(selectpercentile, SelectPercentile(percentile=96,\n score_func=<function f_regression at 0x2aaaf35a08c8>))', '(maxabsscaler, MaxAbsScaler(copy=true))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.9500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=200, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData mean Column', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData mean NsValence', 'MagpieData mean NpValence', 'MagpieData mean NsUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 3']}

matbench_log_kvrh

Fold scores
fold mae rmse mape* max_error
fold_0 0.0639 0.1179 0.0417 1.4823
fold_1 0.0659 0.1231 0.0432 1.2686
fold_2 0.0627 0.1115 0.0411 1.1316
fold_3 0.0668 0.1217 0.0464 1.1890
fold_4 0.0640 0.1172 0.0417 1.4335
Fold score stats
metric mean max min std
mae 0.0647 0.0668 0.0627 0.0015
rmse 0.1183 0.1231 0.1115 0.0041
mape* 0.0428 0.0464 0.0411 0.0019
max_error 1.3010 1.4823 1.1316 0.1362
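
The fold score stats can be recomputed from the per-fold values. The sketch below does this for the matbench_log_kvrh mae column, using the rounded fold scores listed above. It assumes the std column is a population standard deviation (ddof=0); that assumption matches the reported 0.0015, whereas a sample standard deviation would give roughly 0.0017. The names `mae_by_fold` and `fold_score_stats` are illustrative and not part of matbench.

```python
from statistics import fmean, pstdev

# MAE per fold for matbench_log_kvrh, copied from the fold scores table.
mae_by_fold = {
    "fold_0": 0.0639,
    "fold_1": 0.0659,
    "fold_2": 0.0627,
    "fold_3": 0.0668,
    "fold_4": 0.0640,
}


def fold_score_stats(scores):
    """Summarize one metric across folds (hypothetical helper)."""
    values = list(scores)
    return {
        "mean": fmean(values),
        "max": max(values),
        "min": min(values),
        # Assumption: population standard deviation (ddof=0).
        "std": pstdev(values),
    }


stats = fold_score_stats(mae_by_fold.values())
print({name: round(value, 4) for name, value in stats.items()})
```

Because the inputs are already rounded to four decimals, the recomputed values can differ from the reported stats in the last digit for other tasks.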
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.032, score_func=<function f_regression at 0x2aaaf35a2840>))', '(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.8500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=11,\n min_weight_fraction_leaf=0.0, n_estimators=200, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData minimum Column', 'MagpieData mean Row', 'MagpieData mean CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData minimum NValence', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mode NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData maximum SpaceGroupNumber', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0']}
fold_1 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.029, score_func=<function f_regression at 0x2aaaf35a08c8>))', '(zerocount, ZeroCount())', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.9, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=huber, max_depth=9,\n max_features=0.7000000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=13, min_samples_split=8,\n min_weight_fraction_leaf=0.0, n_estimators=500,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.9000000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData minimum Column', 'MagpieData mean Row', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData minimum NValence', 'MagpieData mean NsUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mode NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'density', 'vpa', 'packing fraction', 'sine coulomb matrix eig 0']}
fold_2 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.2))', '(onehotencoder, OneHotEncoder(categorical_features=[false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false],\n dtype=<class float>, minimum_fraction=0.15, sparse=false,\n threshold=10))', '(randomforestregressor, RandomForestRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.35000000000000003, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=200,\n n_jobs=null, oob_score=false, random_state=null,\n verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData minimum Column', 'MagpieData mean Row', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'density', 'vpa', 'packing fraction', 'sine coulomb matrix eig 2']}
fold_3 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.016, score_func=<function f_regression at 0x2aaaf79a28c8>))', '(onehotencoder, OneHotEncoder(categorical_features=[false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, false, false],\n dtype=<class float>, minimum_fraction=0.25, sparse=false,\n threshold=10))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.9500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=200, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData minimum Column', 'MagpieData mean Column', 'MagpieData mean Row', 'MagpieData mean CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData minimum NValence', 'MagpieData mean NValence', 'MagpieData mean NsUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mode NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData maximum SpaceGroupNumber', 'density', 'vpa', 'packing fraction', 'spacegroup_num']}
fold_4 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.0001))', '(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.6500000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=500, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData maximum MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData minimum Column', 'MagpieData mean Row', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData mean NsValence', 'MagpieData mean NpValence', 'MagpieData mean NsUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData mean SpaceGroupNumber', 'density', 'vpa', 'packing fraction', 'sine coulomb matrix eig 0']}
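
The 'features_reduced' lists differ from fold to fold, so it can help to count how often each feature survives reduction. The sketch below uses only an excerpt: the entries from 'density' onward in each matbench_log_kvrh fold above. The variable names are illustrative; the same counting applies to the full lists.

```python
from collections import Counter

# Excerpt of 'features_reduced' for matbench_log_kvrh: entries from
# 'density' to the end of each fold's list, copied from the params dicts.
features_tail_by_fold = {
    "fold_0": ["density", "vpa", "packing fraction", "spacegroup_num",
               "sine coulomb matrix eig 0"],
    "fold_1": ["density", "vpa", "packing fraction",
               "sine coulomb matrix eig 0"],
    "fold_2": ["density", "vpa", "packing fraction",
               "sine coulomb matrix eig 2"],
    "fold_3": ["density", "vpa", "packing fraction", "spacegroup_num"],
    "fold_4": ["density", "vpa", "packing fraction",
               "sine coulomb matrix eig 0"],
}

# Number of folds in which each feature was retained.
counts = Counter(
    feature
    for features in features_tail_by_fold.values()
    for feature in features
)

n_folds = len(features_tail_by_fold)
# Features retained in every fold of this excerpt.
stable = sorted(f for f, c in counts.items() if c == n_folds)

for feature, count in counts.most_common():
    print(f"{feature}: {count}/{n_folds} folds")
print("retained in all folds:", stable)
```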

matbench_mp_e_form

Fold scores
fold mae rmse mape* max_error
fold_0 0.1586 0.2508 1.0829 4.0713
fold_1 0.2026 0.2955 0.9253 5.8108
fold_2 0.1473 0.2256 0.7722 2.7696
fold_3 0.2080 0.3062 1.3958 5.5190
fold_4 0.1467 0.2226 0.8028 3.3888
Fold score stats
metric mean max min std
mae 0.1726 0.2080 0.1467 0.0270
rmse 0.2602 0.3062 0.2226 0.0348
mape* 0.9958 1.3958 0.7722 0.2280
max_error 4.3119 5.8108 2.7696 1.1826
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(gradientboostingregressor, GradientBoostingRegressor(alpha=0.75, criterion=friedman_mse, init=null,\n learning_rate=0.5, loss=huber, max_depth=5,\n max_features=0.15000000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=12, min_samples_split=15,\n min_weight_fraction_leaf=0.0, n_estimators=100,\n n_iter_no_change=null, presort=auto, random_state=null,\n subsample=0.7500000000000001, tol=0.0001,\n validation_fraction=0.1, verbose=0, warm_start=false))'], 'features_reduced': ['PymatgenData range X', 'PymatgenData std_dev X', 'PymatgenData mean melting_point', 'MagpieData mode MendeleevNumber', 'MagpieData avg_dev MeltingT', 'MagpieData maximum Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MatscholarElementData mean embedding 30', 'MatscholarElementData mean embedding 31', 'MatscholarElementData range embedding 41', 'MatscholarElementData minimum embedding 51', 'MatscholarElementData maximum embedding 58', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 67', 'MatscholarElementData minimum embedding 72', 'MatscholarElementData mean embedding 78', 'MatscholarElementData std_dev embedding 79', 'MatscholarElementData mean embedding 80', 'MatscholarElementData minimum embedding 104', 'MatscholarElementData mean embedding 157', 'MatscholarElementData maximum embedding 166', 'MatscholarElementData mean embedding 166', 'MatscholarElementData mean embedding 176', 'DemlData mean boiling_point', 'O', '2-norm', 'frac d valence electrons', 'density', 'vpa']}
fold_1 {'best_pipeline': ['(polynomialfeatures, PolynomialFeatures(degree=2, include_bias=false, interaction_only=false))', '(pca, PCA(copy=true, iterated_power=3, n_components=null, random_state=null,\n svd_solver=randomized, tol=0.0, whiten=false))', '(lassolarscv, LassoLarsCV(copy_X=true, cv=warn, eps=2.220446049250313e-16,\n fit_intercept=true, max_iter=500, max_n_alphas=1000, n_jobs=null,\n normalize=true, positive=false, precompute=auto, verbose=false))'], 'features_reduced': ['PymatgenData range X', 'PymatgenData std_dev X', 'PymatgenData mean melting_point', 'PymatgenData std_dev melting_point', 'MagpieData mode MendeleevNumber', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData maximum Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MatscholarElementData mean embedding 31', 'MatscholarElementData mean embedding 32', 'MatscholarElementData range embedding 41', 'MatscholarElementData minimum embedding 51', 'MatscholarElementData maximum embedding 58', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 67', 'MatscholarElementData minimum embedding 72', 'MatscholarElementData mean embedding 78', 'MatscholarElementData std_dev embedding 79', 'MatscholarElementData mean embedding 80', 'MatscholarElementData minimum embedding 86', 'MatscholarElementData mean embedding 157', 'MatscholarElementData mean embedding 166', 'MatscholarElementData mean embedding 176', 'MatscholarElementData mean embedding 188', 'DemlData mean boiling_point', 'O', '2-norm', 'frac d valence electrons', 'density', 'vpa']}
fold_2 {'best_pipeline': ['(stackingestimator, StackingEstimator(estimator=GradientBoostingRegressor(alpha=0.9, criterion=friedman_mse, init=null,\n learning_rate=0.5, loss=huber, max_depth=4,\n max_features=0.6000000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_...e=0.7500000000000001, tol=0.0001,\n validation_fraction=0.1, verbose=0, warm_start=false)))', '(maxabsscaler, MaxAbsScaler(copy=true))', '(polynomialfeatures, PolynomialFeatures(degree=2, include_bias=false, interaction_only=false))', '(ridgecv, RidgeCV(alphas=array([ 0.1, 1. , 10. ]), cv=null, fit_intercept=true,\n gcv_mode=null, normalize=false, scoring=null, store_cv_values=false))'], 'features_reduced': ['PymatgenData range X', 'PymatgenData std_dev X', 'PymatgenData mean melting_point', 'MagpieData mode MendeleevNumber', 'MagpieData avg_dev MeltingT', 'MagpieData maximum Column', 'MagpieData avg_dev Column', 'MagpieData mode Column', 'MagpieData avg_dev CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData maximum Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MatscholarElementData mean embedding 8', 'MatscholarElementData std_dev embedding 41', 'MatscholarElementData maximum embedding 58', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 68', 'MatscholarElementData minimum embedding 72', 'MatscholarElementData mean embedding 78', 'MatscholarElementData std_dev embedding 79', 'MatscholarElementData mean embedding 80', 'MatscholarElementData std_dev embedding 90', 'MatscholarElementData mean embedding 157', 'MatscholarElementData mean embedding 166', 'MatscholarElementData mean embedding 176', 'MatscholarElementData mean embedding 188', 'DemlData mean boiling_point', 'O', '2-norm', 'frac d valence electrons', 'density', 'vpa', 'packing fraction']}
fold_3 {'best_pipeline': ['(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(selectfwe, SelectFwe(alpha=0.027, score_func=<function f_regression at 0x2b2eb18422f0>))', '(stackingestimator, StackingEstimator(estimator=GradientBoostingRegressor(alpha=0.8, criterion=friedman_mse, init=null,\n learning_rate=0.5, loss=huber, max_depth=3,\n max_features=0.1, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=14, min_sa...e=0.6000000000000001, tol=0.0001,\n validation_fraction=0.1, verbose=0, warm_start=false)))', '(ridgecv, RidgeCV(alphas=array([ 0.1, 1. , 10. ]), cv=null, fit_intercept=true,\n gcv_mode=null, normalize=false, scoring=null, store_cv_values=false))'], 'features_reduced': ['PymatgenData range X', 'PymatgenData std_dev X', 'PymatgenData mean melting_point', 'MagpieData minimum MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData avg_dev MeltingT', 'MagpieData maximum Column', 'MagpieData avg_dev Column', 'MagpieData avg_dev CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MatscholarElementData range embedding 41', 'MatscholarElementData mean embedding 41', 'MatscholarElementData minimum embedding 51', 'MatscholarElementData maximum embedding 58', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 67', 'MatscholarElementData mean embedding 68', 'MatscholarElementData minimum embedding 72', 'MatscholarElementData mean embedding 78', 'MatscholarElementData std_dev embedding 79', 'MatscholarElementData mean embedding 80', 'MatscholarElementData minimum embedding 104', 'MatscholarElementData mean embedding 157', 'MatscholarElementData maximum embedding 166', 'MatscholarElementData mean embedding 176', 'O', '2-norm', 'frac d valence electrons', 'density', 'vpa']}
fold_4 {'best_pipeline': ['(xgbregressor, XGBRegressor(base_score=0.5, booster=gbtree, colsample_bylevel=1,\n colsample_bytree=1, gamma=0, learning_rate=0.5, max_delta_step=0,\n max_depth=5, min_child_weight=14, missing=null, n_estimators=100,\n n_jobs=1, nthread=1, objective=reg:linear, random_state=0,\n reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=null,\n silent=true, subsample=0.9000000000000001))'], 'features_reduced': ['PymatgenData range X', 'PymatgenData std_dev X', 'PymatgenData mean melting_point', 'MagpieData mode MendeleevNumber', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData maximum Column', 'MagpieData avg_dev Column', 'MagpieData mode CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MatscholarElementData mean embedding 8', 'MatscholarElementData mean embedding 30', 'MatscholarElementData std_dev embedding 41', 'MatscholarElementData mean embedding 51', 'MatscholarElementData maximum embedding 58', 'MatscholarElementData mean embedding 65', 'MatscholarElementData minimum embedding 72', 'MatscholarElementData mean embedding 78', 'MatscholarElementData std_dev embedding 79', 'MatscholarElementData mean embedding 80', 'MatscholarElementData minimum embedding 86', 'MatscholarElementData minimum embedding 104', 'MatscholarElementData mean embedding 157', 'MatscholarElementData mean embedding 166', 'MatscholarElementData mean embedding 176', 'MatscholarElementData range embedding 182', 'MatscholarElementData mean embedding 188', 'DemlData mean boiling_point', 'O', '2-norm', 'frac d valence electrons', 'density', 'vpa', 'packing fraction']}

matbench_mp_gap

Fold scores
fold mae rmse mape* max_error
fold_0 0.2799 0.5481 3.5712 5.4792
fold_1 0.2850 0.5671 3.1533 6.9105
fold_2 0.2724 0.5477 4.6097 6.2045
fold_3 0.2909 0.5710 10.0191 6.4590
fold_4 0.2837 0.5714 6.8322 5.5333
Fold score stats
metric mean max min std
mae 0.2824 0.2909 0.2724 0.0061
rmse 0.5611 0.5714 0.5477 0.0109
mape* 5.6371 10.0191 3.1533 2.5347
max_error 6.1173 6.9105 5.4792 0.5480
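
For matbench_mp_gap the mape* column varies far more across folds (3.1533 to 10.0191) than mae or rmse do. The sketch below computes the four metrics on hypothetical values, assuming the standard definitions and a plain, unmasked mean absolute percentage error; what the asterisk in mape* denotes is not stated here, so this is only an illustration of how one near-zero target can dominate a percentage error. The function name and data are made up for the example.

```python
import math


def regression_metrics(y_true, y_pred):
    """Standard regression metrics on paired lists (hypothetical helper)."""
    abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    n = len(abs_errors)
    return {
        "mae": sum(abs_errors) / n,
        "rmse": math.sqrt(sum(e * e for e in abs_errors) / n),
        # Plain MAPE: undefined if any true value is exactly zero.
        "mape": sum(e / abs(t) for e, t in zip(abs_errors, y_true)) / n,
        "max_error": max(abs_errors),
    }


# Hypothetical targets: the last one is close to zero.
y_true = [2.0, 1.0, 0.01]
y_pred = [2.1, 0.9, 0.11]

metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Every prediction here is off by about 0.1, so mae, rmse and max_error all come out near 0.1, while the mape is about 3.38 because the third sample alone contributes a relative error of about 10.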
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(stackingestimator-1, StackingEstimator(estimator=RandomForestRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.4, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=6, min_samples_split=10,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false)))', '(stackingestimator-2, StackingEstimator(estimator=ExtraTreesRegressor(bootstrap=true, criterion=mse, max_depth=null,\n max_features=0.55, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=6, min_samples_split=11,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false)))', '(lassolarscv, LassoLarsCV(copy_X=true, cv=warn, eps=2.220446049250313e-16,\n fit_intercept=true, max_iter=500, max_n_alphas=1000, n_jobs=null,\n normalize=true, positive=false, precompute=auto, verbose=false))'], 'features_reduced': ['HOMO_energy', 'gap_AO', 'PymatgenData std_dev group', 'PymatgenData maximum mendeleev_no', 'PymatgenData mean mendeleev_no', 'PymatgenData std_dev thermal_conductivity', 'MagpieData avg_dev MeltingT', 'MagpieData range Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData avg_dev GSbandgap', 'MagpieData avg_dev SpaceGroupNumber', 'MatscholarElementData std_dev embedding 2', 'MatscholarElementData minimum embedding 40', 'MatscholarElementData minimum embedding 41', 'MatscholarElementData mean embedding 41', 'MatscholarElementData std_dev embedding 41', 'MatscholarElementData mean embedding 45', 'MatscholarElementData minimum embedding 52', 'MatscholarElementData mean embedding 56', 'MatscholarElementData maximum embedding 58', 'MatscholarElementData std_dev embedding 60', 'MatscholarElementData minimum embedding 64', 'MatscholarElementData mean embedding 64', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 71', 'MatscholarElementData std_dev embedding 72', 'MatscholarElementData maximum embedding 82', 'MatscholarElementData mean embedding 82', 'MatscholarElementData mean embedding 93', 'MatscholarElementData maximum embedding 114', 'MatscholarElementData mean embedding 116', 'MatscholarElementData std_dev embedding 119', 'MatscholarElementData std_dev embedding 123', 'MatscholarElementData mean embedding 143', 'MatscholarElementData mean embedding 148', 'MatscholarElementData mean embedding 163', 'MatscholarElementData std_dev embedding 163', 'MatscholarElementData mean embedding 168', 'MatscholarElementData std_dev embedding 170', 'MatscholarElementData mean embedding 179', 'MatscholarElementData minimum embedding 181', 'MatscholarElementData mean embedding 181', 'MatscholarElementData std_dev embedding 192', 'MatscholarElementData mean embedding 194', 'MatscholarElementData mean embedding 198', 'MatscholarElementData maximum embedding 199', 'DemlData std_dev heat_fusion', 'DemlData minimum boiling_point', 'DemlData mean boiling_point', 'DemlData mean electronegativity', 'DemlData std_dev electronegativity', '2-norm', 'transition metal fraction', 'frac p valence electrons', 'frac d valence electrons', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0', 'HOMO_character_d']}
fold_1 {'best_pipeline': ['(stackingestimator-1, StackingEstimator(estimator=RandomForestRegressor(bootstrap=true, criterion=mse, max_depth=null,\n max_features=0.35000000000000003, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=16, min_samples_split=6,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false)))', '(stackingestimator-2, StackingEstimator(estimator=RandomForestRegressor(bootstrap=true, criterion=mse, max_depth=null,\n max_features=0.35000000000000003, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=6, min_samples_split=15,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false)))', '(decisiontreeregressor, DecisionTreeRegressor(criterion=mse, max_depth=9, max_features=null,\n max_leaf_nodes=null, min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=8,\n min_samples_split=2, min_weight_fraction_leaf=0.0,\n presort=false, random_state=null, splitter=best))'], 'features_reduced': ['HOMO_energy', 'gap_AO', 'PymatgenData std_dev X', 'PymatgenData mean block', 'PymatgenData maximum mendeleev_no', 'PymatgenData mean mendeleev_no', 'PymatgenData std_dev thermal_conductivity', 'PymatgenData std_dev melting_point', 'MagpieData range MeltingT', 'MagpieData range Column', 'MagpieData mean Row', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NValence', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData avg_dev GSbandgap', 'MagpieData avg_dev SpaceGroupNumber', 'MatscholarElementData mean embedding 2', 'MatscholarElementData std_dev embedding 2', 'MatscholarElementData minimum embedding 41', 'MatscholarElementData range embedding 41', 'MatscholarElementData mean embedding 41', 'MatscholarElementData std_dev embedding 41', 'MatscholarElementData mean embedding 45', 
'MatscholarElementData minimum embedding 52', 'MatscholarElementData maximum embedding 58', 'MatscholarElementData minimum embedding 64', 'MatscholarElementData mean embedding 64', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 66', 'MatscholarElementData mean embedding 71', 'MatscholarElementData mean embedding 79', 'MatscholarElementData maximum embedding 82', 'MatscholarElementData mean embedding 82', 'MatscholarElementData std_dev embedding 83', 'MatscholarElementData mean embedding 90', 'MatscholarElementData mean embedding 93', 'MatscholarElementData minimum embedding 112', 'MatscholarElementData mean embedding 116', 'MatscholarElementData std_dev embedding 118', 'MatscholarElementData std_dev embedding 119', 'MatscholarElementData std_dev embedding 122', 'MatscholarElementData mean embedding 123', 'MatscholarElementData std_dev embedding 123', 'MatscholarElementData mean embedding 132', 'MatscholarElementData std_dev embedding 141', 'MatscholarElementData mean embedding 143', 'MatscholarElementData mean embedding 148', 'MatscholarElementData mean embedding 156', 'MatscholarElementData mean embedding 163', 'MatscholarElementData std_dev embedding 163', 'MatscholarElementData minimum embedding 166', 'MatscholarElementData mean embedding 168', 'MatscholarElementData mean embedding 179', 'MatscholarElementData mean embedding 188', 'MatscholarElementData std_dev embedding 192', 'MatscholarElementData mean embedding 194', 'MatscholarElementData mean embedding 195', 'MatscholarElementData mean embedding 198', 'MatscholarElementData maximum embedding 199', 'DemlData minimum boiling_point', 'DemlData mean boiling_point', 'DemlData minimum heat_cap', 'DemlData mean electronegativity', '2-norm', 'transition metal fraction', 'frac p valence electrons', 'frac d valence electrons', 'density', 'vpa', 'packing fraction', 'sine coulomb matrix eig 0', 'HOMO_character_d']}
fold_2 {'best_pipeline': ['(stackingestimator-1, StackingEstimator(estimator=RandomForestRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.45, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=9, min_samples_split=11,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false)))', '(stackingestimator-2, StackingEstimator(estimator=RandomForestRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.3, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=16, min_samples_split=11,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false)))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.8, max_leaf_nodes=null, min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=1, min_samples_split=3,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false))'], 'features_reduced': ['HOMO_energy', 'LUMO_energy', 'gap_AO', 'PymatgenData std_dev X', 'PymatgenData mean block', 'PymatgenData maximum mendeleev_no', 'PymatgenData std_dev thermal_conductivity', 'MagpieData range MeltingT', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev NpValence', 'MagpieData mean NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData avg_dev SpaceGroupNumber', 'MatscholarElementData mean embedding 2', 'MatscholarElementData std_dev embedding 2', 'MatscholarElementData maximum embedding 10', 'MatscholarElementData minimum embedding 41', 'MatscholarElementData range embedding 41', 'MatscholarElementData mean embedding 41', 'MatscholarElementData mean embedding 45', 
'MatscholarElementData minimum embedding 52', 'MatscholarElementData mean embedding 56', 'MatscholarElementData maximum embedding 58', 'MatscholarElementData mean embedding 63', 'MatscholarElementData mean embedding 64', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 71', 'MatscholarElementData std_dev embedding 72', 'MatscholarElementData std_dev embedding 78', 'MatscholarElementData mean embedding 82', 'MatscholarElementData std_dev embedding 83', 'MatscholarElementData mean embedding 91', 'MatscholarElementData mean embedding 93', 'MatscholarElementData std_dev embedding 97', 'MatscholarElementData mean embedding 116', 'MatscholarElementData std_dev embedding 118', 'MatscholarElementData std_dev embedding 119', 'MatscholarElementData mean embedding 123', 'MatscholarElementData std_dev embedding 123', 'MatscholarElementData minimum embedding 130', 'MatscholarElementData std_dev embedding 130', 'MatscholarElementData mean embedding 143', 'MatscholarElementData mean embedding 163', 'MatscholarElementData minimum embedding 166', 'MatscholarElementData mean embedding 168', 'MatscholarElementData std_dev embedding 170', 'MatscholarElementData mean embedding 181', 'MatscholarElementData std_dev embedding 192', 'MatscholarElementData mean embedding 194', 'MatscholarElementData mean embedding 198', 'MatscholarElementData maximum embedding 199', 'DemlData minimum molar_vol', 'DemlData std_dev heat_fusion', 'DemlData minimum boiling_point', 'DemlData mean boiling_point', 'DemlData mean heat_cap', 'DemlData mean electronegativity', '2-norm', 'transition metal fraction', 'frac p valence electrons', 'frac d valence electrons', 'density', 'vpa', 'packing fraction', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'HOMO_character_d']}
fold_3 {'best_pipeline': ['(stackingestimator-1, StackingEstimator(estimator=ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.45, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=19, min_samples_split=20,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false)))', '(stackingestimator-2, StackingEstimator(estimator=RandomForestRegressor(bootstrap=true, criterion=mse, max_depth=null,\n max_features=0.15000000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=9, min_samples_split=5,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false)))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.7000000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=13, min_samples_split=7,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false))'], 'features_reduced': ['HOMO_energy', 'gap_AO', 'PymatgenData std_dev X', 'PymatgenData mean block', 'PymatgenData maximum mendeleev_no', 'PymatgenData mean mendeleev_no', 'PymatgenData std_dev thermal_conductivity', 'MagpieData avg_dev MeltingT', 'MagpieData range Column', 'MagpieData mean Row', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NValence', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData avg_dev SpaceGroupNumber', 'MatscholarElementData mean embedding 2', 'MatscholarElementData std_dev embedding 2', 'MatscholarElementData minimum embedding 41', 'MatscholarElementData range embedding 41', 'MatscholarElementData mean embedding 41', 'MatscholarElementData std_dev embedding 41', 
'MatscholarElementData mean embedding 42', 'MatscholarElementData mean embedding 51', 'MatscholarElementData minimum embedding 52', 'MatscholarElementData mean embedding 56', 'MatscholarElementData maximum embedding 58', 'MatscholarElementData std_dev embedding 60', 'MatscholarElementData minimum embedding 64', 'MatscholarElementData mean embedding 64', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 71', 'MatscholarElementData mean embedding 72', 'MatscholarElementData std_dev embedding 72', 'MatscholarElementData std_dev embedding 73', 'MatscholarElementData maximum embedding 82', 'MatscholarElementData mean embedding 82', 'MatscholarElementData mean embedding 93', 'MatscholarElementData mean embedding 116', 'MatscholarElementData std_dev embedding 118', 'MatscholarElementData std_dev embedding 119', 'MatscholarElementData mean embedding 123', 'MatscholarElementData std_dev embedding 123', 'MatscholarElementData mean embedding 143', 'MatscholarElementData mean embedding 148', 'MatscholarElementData std_dev embedding 149', 'MatscholarElementData mean embedding 163', 'MatscholarElementData minimum embedding 166', 'MatscholarElementData mean embedding 168', 'MatscholarElementData std_dev embedding 170', 'MatscholarElementData mean embedding 179', 'MatscholarElementData std_dev embedding 192', 'MatscholarElementData mean embedding 194', 'MatscholarElementData mean embedding 195', 'MatscholarElementData mean embedding 198', 'MatscholarElementData maximum embedding 199', 'DemlData std_dev heat_fusion', 'DemlData minimum boiling_point', 'DemlData mean boiling_point', 'DemlData minimum heat_cap', 'DemlData mean electronegativity', '2-norm', 'transition metal fraction', 'frac p valence electrons', 'frac d valence electrons', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0', 'HOMO_character_d']}
fold_4 {'best_pipeline': ['(stackingestimator-1, StackingEstimator(estimator=GradientBoostingRegressor(alpha=0.85, criterion=friedman_mse, init=null,\n learning_rate=0.01, loss=lad, max_depth=1,\n max_features=0.45, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=11, min_s...e=0.9500000000000001, tol=0.0001,\n validation_fraction=0.1, verbose=0, warm_start=false)))', '(stackingestimator-2, StackingEstimator(estimator=RandomForestRegressor(bootstrap=true, criterion=mse, max_depth=null,\n max_features=0.6000000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=9, min_samples_split=16,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false)))', '(randomforestregressor, RandomForestRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.45, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=3, min_samples_split=4,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0, warm_start=false))'], 'features_reduced': ['HOMO_energy', 'gap_AO', 'PymatgenData std_dev X', 'PymatgenData mean group', 'PymatgenData mean block', 'PymatgenData maximum mendeleev_no', 'PymatgenData range mendeleev_no', 'PymatgenData mean mendeleev_no', 'PymatgenData std_dev thermal_conductivity', 'PymatgenData std_dev melting_point', 'MagpieData range Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NValence', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MatscholarElementData mean embedding 2', 'MatscholarElementData std_dev embedding 2', 'MatscholarElementData minimum embedding 40', 'MatscholarElementData minimum embedding 41', 'MatscholarElementData range embedding 41', 
'MatscholarElementData std_dev embedding 41', 'MatscholarElementData mean embedding 49', 'MatscholarElementData minimum embedding 52', 'MatscholarElementData mean embedding 56', 'MatscholarElementData maximum embedding 58', 'MatscholarElementData minimum embedding 64', 'MatscholarElementData mean embedding 64', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 66', 'MatscholarElementData mean embedding 71', 'MatscholarElementData std_dev embedding 72', 'MatscholarElementData maximum embedding 82', 'MatscholarElementData mean embedding 82', 'MatscholarElementData mean embedding 90', 'MatscholarElementData mean embedding 93', 'MatscholarElementData mean embedding 116', 'MatscholarElementData std_dev embedding 119', 'MatscholarElementData std_dev embedding 123', 'MatscholarElementData std_dev embedding 130', 'MatscholarElementData mean embedding 143', 'MatscholarElementData std_dev embedding 145', 'MatscholarElementData mean embedding 148', 'MatscholarElementData std_dev embedding 149', 'MatscholarElementData mean embedding 163', 'MatscholarElementData std_dev embedding 166', 'MatscholarElementData mean embedding 168', 'MatscholarElementData mean embedding 179', 'MatscholarElementData mean embedding 181', 'MatscholarElementData std_dev embedding 192', 'MatscholarElementData mean embedding 194', 'MatscholarElementData mean embedding 198', 'MatscholarElementData maximum embedding 199', 'DemlData std_dev heat_fusion', 'DemlData mean boiling_point', 'DemlData minimum heat_cap', 'DemlData range heat_cap', 'DemlData mean electronegativity', '2-norm', 'transition metal fraction', 'frac p valence electrons', 'frac d valence electrons', 'density', 'vpa', 'packing fraction', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'HOMO_character_d']}

matbench_mp_is_metal

Fold scores
fold accuracy balanced_accuracy f1 rocauc
fold_0 0.9133 0.9094 0.8982 0.9094
fold_1 0.9123 0.9086 0.8972 0.9086
fold_2 0.9129 0.9089 0.8976 0.9089
fold_3 0.9146 0.9108 0.8998 0.9108
fold_4 0.9129 0.9086 0.8974 0.9086
Fold score stats
metric mean max min std
accuracy 0.9132 0.9146 0.9123 0.0008
balanced_accuracy 0.9093 0.9108 0.9086 0.0008
f1 0.8981 0.8998 0.8972 0.0009
rocauc 0.9093 0.9108 0.9086 0.0008
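In every fold above, the balanced_accuracy and rocauc columns are identical. This would be expected if rocauc was computed from hard 0/1 class predictions rather than predicted probabilities (an assumption; this page does not state how the scores were produced). In that case ROC AUC reduces to the mean of the true-positive and true-negative rates, which is the balanced accuracy. A minimal sketch of that identity, using made-up labels rather than benchmark data and two illustrative helper functions:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on the positive class and the recall on the negative class."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    tpr = sum(1 for p in pos if p == 1) / len(pos)
    tnr = sum(1 for p in neg if p == 0) / len(neg)
    return (tpr + tnr) / 2


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels for illustration only (not taken from matbench_mp_is_metal).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

bal_acc = balanced_accuracy(y_true, y_pred)
auc = roc_auc(y_true, y_pred)  # hard labels passed in place of probabilities
print(bal_acc, auc)
```

With probabilistic scores the two metrics would generally differ, so the equality is a property of scoring on hard labels, not of the model.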
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(randomforestclassifier, RandomForestClassifier(bootstrap=false, class_weight=null,\n criterion=entropy, max_depth=null,\n max_features=0.35000000000000003, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=2, min_samples_split=20,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['HOMO_energy', 'gap_AO', 'PymatgenData range X', 'PymatgenData std_dev group', 'PymatgenData mean block', 'PymatgenData maximum mendeleev_no', 'PymatgenData mean mendeleev_no', 'PymatgenData std_dev mendeleev_no', 'PymatgenData mean thermal_conductivity', 'PymatgenData std_dev thermal_conductivity', 'MagpieData maximum MendeleevNumber', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'MatscholarElementData mean embedding 2', 'MatscholarElementData mean embedding 4', 'MatscholarElementData mean embedding 41', 'MatscholarElementData std_dev embedding 41', 'MatscholarElementData mean embedding 46', 'MatscholarElementData mean embedding 51', 'MatscholarElementData minimum embedding 52', 'MatscholarElementData mean embedding 56', 'MatscholarElementData mean embedding 64', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 67', 'MatscholarElementData std_dev embedding 67', 'MatscholarElementData mean embedding 68', 'MatscholarElementData std_dev embedding 73', 'MatscholarElementData mean embedding 78', 'MatscholarElementData std_dev embedding 78', 'MatscholarElementData mean embedding 91', 'MatscholarElementData mean embedding 93', 
'MatscholarElementData mean embedding 99', 'MatscholarElementData std_dev embedding 99', 'MatscholarElementData mean embedding 118', 'MatscholarElementData std_dev embedding 118', 'MatscholarElementData std_dev embedding 121', 'MatscholarElementData mean embedding 123', 'MatscholarElementData mean embedding 139', 'MatscholarElementData mean embedding 149', 'MatscholarElementData mean embedding 179', 'MatscholarElementData std_dev embedding 179', 'MatscholarElementData mean embedding 188', 'MatscholarElementData mean embedding 198', 'DemlData mean molar_vol', 'DemlData std_dev molar_vol', 'DemlData std_dev melting_point', 'DemlData mean boiling_point', 'DemlData mean first_ioniz', 'DemlData mean electronegativity', 'transition metal fraction', 'frac p valence electrons', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'sine coulomb matrix eig 5', 'sine coulomb matrix eig 6', 'sine coulomb matrix eig 7', 'sine coulomb matrix eig 9', 'sine coulomb matrix eig 10', 'sine coulomb matrix eig 11']}
fold_1 {'best_pipeline': ['(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(randomforestclassifier, RandomForestClassifier(bootstrap=false, class_weight=null,\n criterion=entropy, max_depth=null, max_features=0.55,\n max_leaf_nodes=null, min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=4,\n min_samples_split=5, min_weight_fraction_leaf=0.0,\n n_estimators=100, n_jobs=null, oob_score=false,\n random_state=null, verbose=0, warm_start=false))'], 'features_reduced': ['HOMO_energy', 'gap_AO', 'PymatgenData range X', 'PymatgenData std_dev X', 'PymatgenData std_dev group', 'PymatgenData mean block', 'PymatgenData mean mendeleev_no', 'PymatgenData std_dev mendeleev_no', 'PymatgenData mean thermal_conductivity', 'PymatgenData std_dev thermal_conductivity', 'MagpieData maximum MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData avg_dev Column', 'MagpieData mean Electronegativity', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData avg_dev GSvolume_pa', 'MatscholarElementData mean embedding 4', 'MatscholarElementData mean embedding 41', 'MatscholarElementData std_dev embedding 41', 'MatscholarElementData mean embedding 46', 'MatscholarElementData mean embedding 51', 'MatscholarElementData minimum embedding 52', 'MatscholarElementData mean embedding 56', 'MatscholarElementData mean embedding 61', 'MatscholarElementData mean embedding 64', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 67', 'MatscholarElementData mean embedding 68', 'MatscholarElementData mean embedding 72', 'MatscholarElementData std_dev embedding 72', 'MatscholarElementData std_dev embedding 73', 'MatscholarElementData mean embedding 78', 'MatscholarElementData mean embedding 93', 'MatscholarElementData mean embedding 99', 'MatscholarElementData std_dev embedding 99', 'MatscholarElementData mean embedding 118', 'MatscholarElementData std_dev embedding 118', 'MatscholarElementData mean embedding 123', 
'MatscholarElementData mean embedding 139', 'MatscholarElementData mean embedding 149', 'MatscholarElementData std_dev embedding 154', 'MatscholarElementData mean embedding 168', 'MatscholarElementData mean embedding 171', 'MatscholarElementData mean embedding 179', 'MatscholarElementData std_dev embedding 179', 'MatscholarElementData mean embedding 188', 'MatscholarElementData mean embedding 198', 'MatscholarElementData std_dev embedding 199', 'DemlData mean molar_vol', 'DemlData std_dev molar_vol', 'DemlData mean melting_point', 'DemlData mean boiling_point', 'DemlData mean first_ioniz', 'DemlData mean electronegativity', 'transition metal fraction', 'avg p valence electrons', 'frac p valence electrons', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'sine coulomb matrix eig 5', 'sine coulomb matrix eig 6', 'sine coulomb matrix eig 7', 'sine coulomb matrix eig 8', 'sine coulomb matrix eig 9', 'sine coulomb matrix eig 10', 'sine coulomb matrix eig 11']}
fold_2 {'best_pipeline': ['(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(randomforestclassifier, RandomForestClassifier(bootstrap=false, class_weight=null,\n criterion=entropy, max_depth=null, max_features=0.5,\n max_leaf_nodes=null, min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=1,\n min_samples_split=18, min_weight_fraction_leaf=0.0,\n n_estimators=100, n_jobs=null, oob_score=false,\n random_state=null, verbose=0, warm_start=false))'], 'features_reduced': ['HOMO_energy', 'gap_AO', 'PymatgenData mean block', 'PymatgenData std_dev mendeleev_no', 'PymatgenData minimum thermal_conductivity', 'PymatgenData mean thermal_conductivity', 'PymatgenData std_dev thermal_conductivity', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev Column', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData mean NdUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData avg_dev SpaceGroupNumber', 'MatscholarElementData mean embedding 4', 'MatscholarElementData mean embedding 41', 'MatscholarElementData std_dev embedding 41', 'MatscholarElementData mean embedding 46', 'MatscholarElementData mean embedding 51', 'MatscholarElementData minimum embedding 52', 'MatscholarElementData mean embedding 56', 'MatscholarElementData mean embedding 64', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 67', 'MatscholarElementData mean embedding 68', 'MatscholarElementData mean embedding 72', 'MatscholarElementData std_dev embedding 73', 'MatscholarElementData mean embedding 78', 'MatscholarElementData std_dev embedding 79', 'MatscholarElementData mean embedding 87', 'MatscholarElementData mean embedding 91', 'MatscholarElementData mean embedding 93', 'MatscholarElementData mean embedding 95', 
'MatscholarElementData mean embedding 99', 'MatscholarElementData std_dev embedding 99', 'MatscholarElementData mean embedding 118', 'MatscholarElementData std_dev embedding 118', 'MatscholarElementData mean embedding 123', 'MatscholarElementData mean embedding 139', 'MatscholarElementData mean embedding 155', 'MatscholarElementData mean embedding 179', 'MatscholarElementData std_dev embedding 179', 'MatscholarElementData mean embedding 188', 'MatscholarElementData mean embedding 198', 'DemlData std_dev molar_vol', 'DemlData mean melting_point', 'DemlData mean boiling_point', 'DemlData mean first_ioniz', 'DemlData mean electronegativity', 'transition metal fraction', 'avg p valence electrons', 'frac p valence electrons', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'sine coulomb matrix eig 5', 'sine coulomb matrix eig 6', 'sine coulomb matrix eig 7', 'sine coulomb matrix eig 8', 'sine coulomb matrix eig 9', 'sine coulomb matrix eig 10', 'sine coulomb matrix eig 11']}
fold_3 {'best_pipeline': ['(stackingestimator, StackingEstimator(estimator=RandomForestClassifier(bootstrap=false, class_weight=null,\n criterion=entropy, max_depth=null, max_features=0.5,\n max_leaf_nodes=null, min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=3,\n min_samples_split=8, min_weight_fraction_leaf=0.0,\n n_estimators=100, n_jobs=null, oob_score=false,\n random_state=null, verbose=0, warm_start=false)))', '(variancethreshold, VarianceThreshold(threshold=0.2))', '(randomforestclassifier, RandomForestClassifier(bootstrap=false, class_weight=null,\n criterion=entropy, max_depth=null,\n max_features=0.6000000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=3, min_samples_split=9,\n min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['HOMO_energy', 'gap_AO', 'PymatgenData std_dev group', 'PymatgenData mean block', 'PymatgenData range mendeleev_no', 'PymatgenData mean mendeleev_no', 'PymatgenData minimum thermal_conductivity', 'PymatgenData mean thermal_conductivity', 'PymatgenData std_dev thermal_conductivity', 'MagpieData maximum MendeleevNumber', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData avg_dev GSvolume_pa', 'MagpieData avg_dev SpaceGroupNumber', 'MatscholarElementData mean embedding 27', 'MatscholarElementData std_dev embedding 32', 'MatscholarElementData mean embedding 40', 'MatscholarElementData mean embedding 41', 'MatscholarElementData std_dev embedding 41', 'MatscholarElementData mean embedding 46', 'MatscholarElementData mean embedding 51', 'MatscholarElementData minimum embedding 52', 'MatscholarElementData mean embedding 64', 
'MatscholarElementData mean embedding 65', 'MatscholarElementData std_dev embedding 67', 'MatscholarElementData mean embedding 68', 'MatscholarElementData std_dev embedding 68', 'MatscholarElementData mean embedding 72', 'MatscholarElementData std_dev embedding 72', 'MatscholarElementData std_dev embedding 73', 'MatscholarElementData mean embedding 78', 'MatscholarElementData std_dev embedding 78', 'MatscholarElementData mean embedding 93', 'MatscholarElementData mean embedding 95', 'MatscholarElementData mean embedding 99', 'MatscholarElementData std_dev embedding 99', 'MatscholarElementData mean embedding 107', 'MatscholarElementData mean embedding 115', 'MatscholarElementData mean embedding 118', 'MatscholarElementData std_dev embedding 118', 'MatscholarElementData mean embedding 128', 'MatscholarElementData mean embedding 139', 'MatscholarElementData mean embedding 171', 'MatscholarElementData mean embedding 179', 'MatscholarElementData std_dev embedding 179', 'MatscholarElementData mean embedding 188', 'MatscholarElementData mean embedding 198', 'DemlData std_dev molar_vol', 'DemlData mean boiling_point', 'DemlData mean first_ioniz', 'DemlData std_dev first_ioniz', 'DemlData mean electronegativity', 'transition metal fraction', 'frac p valence electrons', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'sine coulomb matrix eig 5', 'sine coulomb matrix eig 6', 'sine coulomb matrix eig 7', 'sine coulomb matrix eig 9', 'sine coulomb matrix eig 10', 'sine coulomb matrix eig 11']}
fold_4 {'best_pipeline': ['(featureunion, FeatureUnion(n_jobs=null,\n transformer_list=[(functiontransformer, FunctionTransformer(accept_sparse=false, check_inverse=true,\n func=<function copy at 0x2b097cae9d08>, inv_kw_args=null,\n inverse_func=null, kw_args=null, pass_y=deprecated,\n validate=null)), (standardscaler, StandardScaler(copy=true, with_mean=true, with_std=true))],\n transformer_weights=null))', '(randomforestclassifier, RandomForestClassifier(bootstrap=false, class_weight=null,\n criterion=entropy, max_depth=null, max_features=0.2,\n max_leaf_nodes=null, min_impurity_decrease=0.0,\n min_impurity_split=null, min_samples_leaf=6,\n min_samples_split=9, min_weight_fraction_leaf=0.0,\n n_estimators=100, n_jobs=null, oob_score=false,\n random_state=null, verbose=0, warm_start=false))'], 'features_reduced': ['HOMO_energy', 'gap_AO', 'PymatgenData std_dev X', 'PymatgenData mean block', 'PymatgenData maximum mendeleev_no', 'PymatgenData std_dev mendeleev_no', 'PymatgenData mean thermal_conductivity', 'PymatgenData std_dev thermal_conductivity', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'MatscholarElementData mean embedding 41', 'MatscholarElementData std_dev embedding 41', 'MatscholarElementData mean embedding 46', 'MatscholarElementData mean embedding 51', 'MatscholarElementData minimum embedding 52', 'MatscholarElementData mean embedding 52', 'MatscholarElementData mean embedding 61', 'MatscholarElementData mean embedding 64', 'MatscholarElementData mean embedding 65', 'MatscholarElementData mean embedding 68', 'MatscholarElementData mean embedding 72', 
'MatscholarElementData std_dev embedding 73', 'MatscholarElementData mean embedding 78', 'MatscholarElementData std_dev embedding 78', 'MatscholarElementData mean embedding 93', 'MatscholarElementData mean embedding 95', 'MatscholarElementData mean embedding 99', 'MatscholarElementData std_dev embedding 99', 'MatscholarElementData mean embedding 107', 'MatscholarElementData mean embedding 115', 'MatscholarElementData mean embedding 118', 'MatscholarElementData std_dev embedding 118', 'MatscholarElementData mean embedding 123', 'MatscholarElementData mean embedding 139', 'MatscholarElementData mean embedding 179', 'MatscholarElementData std_dev embedding 179', 'MatscholarElementData mean embedding 188', 'MatscholarElementData mean embedding 198', 'MatscholarElementData std_dev embedding 199', 'DemlData std_dev molar_vol', 'DemlData mean heat_fusion', 'DemlData mean boiling_point', 'DemlData mean first_ioniz', 'DemlData mean electronegativity', 'transition metal fraction', 'avg p valence electrons', 'frac p valence electrons', 'density', 'vpa', 'packing fraction', 'spacegroup_num', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'sine coulomb matrix eig 5', 'sine coulomb matrix eig 6', 'sine coulomb matrix eig 7', 'sine coulomb matrix eig 8', 'sine coulomb matrix eig 9', 'sine coulomb matrix eig 10', 'sine coulomb matrix eig 11']}

matbench_perovskites

Fold scores
fold mae rmse mape* max_error
fold_0 0.2159 0.3114 0.2077 2.7651
fold_1 0.1904 0.2857 0.1944 2.6783
fold_2 0.1962 0.2869 0.1933 2.4466
fold_3 0.1992 0.2907 0.2209 3.3116
fold_4 0.2006 0.3023 0.1886 2.4386
Fold score stats
metric mean max min std
mae 0.2005 0.2159 0.1904 0.0085
rmse 0.2954 0.3114 0.2857 0.0099
mape* 0.2010 0.2209 0.1886 0.0118
max_error 2.7280 3.3116 2.4386 0.3186
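The "Fold score stats" rows can be recomputed from the per-fold values in the "Fold scores" table. The sketch below does this for the matbench_perovskites MAE column using only the Python standard library. It assumes the std column is a population standard deviation (dividing by the number of folds), which is what reproduces the tabulated 0.0085; the sample standard deviation would give 0.0095 instead.

```python
import statistics

# MAE per fold for matbench_perovskites, copied from the "Fold scores" table.
mae = [0.2159, 0.1904, 0.1962, 0.1992, 0.2006]

stats = {
    "mean": statistics.mean(mae),
    "max": max(mae),
    "min": min(mae),
    # Population standard deviation: divides by N, not N - 1 (assumed convention).
    "std": statistics.pstdev(mae),
}

for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Because the inputs are already rounded to four decimals, small last-digit differences from the published stats are possible for other metrics.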
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.1))', '(robustscaler, RobustScaler(copy=true, quantile_range=(25.0, 75.0), with_centering=true,\n with_scaling=true))', '(randomforestregressor, RandomForestRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.45000000000000007, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=20,\n n_jobs=null, oob_score=false, random_state=null,\n verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData range Number', 'MagpieData minimum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData minimum AtomicWeight', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData maximum CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NValence', 'MagpieData mean NdUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'range oxidation state', 'std_dev oxidation state', 'avg anion electron affinity', 'avg ionic char', 'density', 'vpa', 'packing fraction', 'ewald_energy', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'structural complexity per cell']}
fold_1 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.1))', '(zerocount, ZeroCount())', '(randomforestregressor, RandomForestRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.5500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=100,\n n_jobs=null, oob_score=false, random_state=null,\n verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData range Number', 'MagpieData minimum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData minimum AtomicWeight', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NValence', 'MagpieData mean NdUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'range oxidation state', 'std_dev oxidation state', 'avg anion electron affinity', 'avg ionic char', 'density', 'vpa', 'packing fraction', 'ewald_energy', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'structural complexity per atom']}
fold_2 {'best_pipeline': ['(selectfwe, SelectFwe(alpha=0.03, score_func=<function f_regression at 0x2aaaf35a08c8>))', '(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.75, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=huber, max_depth=9,\n max_features=0.6500000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=10, min_samples_split=11,\n min_weight_fraction_leaf=0.0, n_estimators=100,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.6500000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData range Number', 'MagpieData minimum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData minimum AtomicWeight', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NValence', 'MagpieData mean NdUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'range oxidation state', 'std_dev oxidation state', 'avg anion electron affinity', 'avg ionic char', 'density', 'vpa', 'packing fraction', 'ewald_energy', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'structural complexity per atom']}
fold_3 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.1))', '(robustscaler, RobustScaler(copy=true, quantile_range=(25.0, 75.0), with_centering=true,\n with_scaling=true))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.95, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=ls, max_depth=7,\n max_features=0.8, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=19, min_samples_split=8,\n min_weight_fraction_leaf=0.0, n_estimators=200,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.3, tol=0.0001,\n validation_fraction=0.1, verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData range Number', 'MagpieData minimum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData minimum AtomicWeight', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData avg_dev NValence', 'MagpieData mean NdUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'range oxidation state', 'std_dev oxidation state', 'avg anion electron affinity', 'avg ionic char', 'density', 'vpa', 'packing fraction', 'ewald_energy', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'structural complexity per cell']}
fold_4 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.05))', '(maxabsscaler, MaxAbsScaler(copy=true))', '(randomforestregressor, RandomForestRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.6500000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=8,\n min_weight_fraction_leaf=0.0, n_estimators=100,\n n_jobs=null, oob_score=false, random_state=null,\n verbose=0, warm_start=false))'], 'features_reduced': ['MagpieData range Number', 'MagpieData minimum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData minimum AtomicWeight', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData mean NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NValence', 'MagpieData mean NdUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'std_dev oxidation state', 'avg anion electron affinity', 'avg ionic char', 'density', 'vpa', 'packing fraction', 'ewald_energy', 'sine coulomb matrix eig 1', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3', 'sine coulomb matrix eig 4', 'structural complexity per cell']}

matbench_phonons

Fold scores
fold mae rmse mape* max_error
fold_0 67.5727 146.7970 0.1079 1151.5570
fold_1 54.0755 100.2097 0.1048 890.4159
fold_2 50.9853 96.5991 0.0931 680.9361
fold_3 59.6458 127.8555 0.1142 926.0969
fold_4 48.5738 77.0626 0.0958 383.1912
Fold score stats
metric mean max min std
mae 56.1706 67.5727 48.5738 6.7981
rmse 109.7048 146.7970 77.0626 24.6280
mape* 0.1032 0.1142 0.0931 0.0078
max_error 806.4394 1151.5570 383.1912 258.9850
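
The fold score stats can be cross-checked against the per-fold values. The sketch below recomputes the MAE row for matbench_phonons using only the Python standard library. It assumes the reported std is a population standard deviation (ddof=0). That is inferred from the numbers, not stated on this page. The variable names are illustrative.

```python
from statistics import mean, pstdev

# MAE per fold for matbench_phonons, copied from the fold scores table.
fold_mae = [67.5727, 54.0755, 50.9853, 59.6458, 48.5738]

# Assumption: std is the population standard deviation (ddof=0),
# which is what pstdev computes.
stats = {
    "mean": round(mean(fold_mae), 4),
    "max": max(fold_mae),
    "min": min(fold_mae),
    "std": round(pstdev(fold_mae), 4),
}
print(stats)
```

Under that assumption the recomputed mean and std should agree with the reported 56.1706 and 6.7981 to about four decimal places. The inputs are themselves rounded, so small differences in the last digit are possible for other metrics.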
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.01))', '(robustscaler, RobustScaler(copy=true, quantile_range=(25.0, 75.0), with_centering=true,\n with_scaling=true))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.7500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=500, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum Number', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData range Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData minimum CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NdValence', 'MagpieData mean NpUnfilled', 'MagpieData mean NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData mode GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'minimum oxidation state', 'avg anion electron affinity', 'max ionic char', 'density', 'spacegroup_num', 'ewald_energy', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 4']}
fold_1 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.005))', '(maxabsscaler, MaxAbsScaler(copy=true))', '(gradientboostingregressor, GradientBoostingRegressor(alpha=0.8, criterion=friedman_mse, init=null,\n learning_rate=0.1, loss=ls, max_depth=9,\n max_features=0.7000000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=17,\n min_weight_fraction_leaf=0.0, n_estimators=500,\n n_iter_no_change=null, presort=auto,\n random_state=null, subsample=0.9000000000000001,\n tol=0.0001, validation_fraction=0.1, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum Number', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData minimum CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NValence', 'MagpieData mean NpUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData mean GSbandgap', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'avg anion electron affinity', 'density', 'vpa', 'spacegroup_num', 'ewald_energy', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3']}
fold_2 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.1))', '(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.8500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=500, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum Number', 'MagpieData mode Number', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData range Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData minimum CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NValence', 'MagpieData mean NsUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData mode GSbandgap', 'MagpieData avg_dev SpaceGroupNumber', 'minimum oxidation state', 'avg anion electron affinity', 'max ionic char', 'density', 'packing fraction', 'ewald_energy', 'sine coulomb matrix eig 2', 'structural complexity per cell']}
fold_3 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.0001))', '(onehotencoder, OneHotEncoder(categorical_features=[false, false, false, false, false, false,\n false, false, false, false, false, false,\n false, false, false, false, true, false,\n false, true, false, false, false, false,\n false, false, false, true, false, false, ...],\n dtype=<class float>, minimum_fraction=0.05, sparse=false,\n threshold=10))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.6500000000000001, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=1000,\n n_jobs=null, oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum Number', 'MagpieData mode Number', 'MagpieData avg_dev AtomicWeight', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData range Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData minimum CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData maximum Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev NValence', 'MagpieData maximum NpUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData range NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData mode GSbandgap', 'avg anion electron affinity', 'max ionic char', 'density', 'crystal_system_int', 'ewald_energy', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 4', 'sine coulomb matrix eig 5']}
fold_4 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.2))', '(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.5500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=200, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData minimum Number', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData range Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData minimum CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData maximum Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NValence', 'MagpieData maximum NpUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData mean NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev SpaceGroupNumber', 'std_dev oxidation state', 'avg anion electron affinity', 'max ionic char', 'density', 'spacegroup_num', 'ewald_energy', 'sine coulomb matrix eig 0', 'sine coulomb matrix eig 2', 'sine coulomb matrix eig 3']}

matbench_steels

Fold scores
fold mae rmse mape* max_error
fold_0 109.3058 188.8049 0.0693 1082.7703
fold_1 80.4188 109.2771 0.0569 416.3620
fold_2 83.5360 120.2935 0.0607 424.5913
fold_3 98.7186 136.5898 0.0722 473.4563
fold_4 115.4851 215.1149 0.0891 1142.9223
Fold score stats
metric mean max min std
mae 97.4929 115.4851 80.4188 13.7919
rmse 154.0161 215.1149 109.2771 40.9531
mape* 0.0696 0.0891 0.0569 0.0112
max_error 708.0205 1142.9223 416.3620 331.6607
Fold parameters
fold params dict
fold_0 {'best_pipeline': ['(selectfrommodel, SelectFromModel(estimator=ExtraTreesRegressor(bootstrap=false, criterion=mse,\n max_depth=null,\n max_features=0.9500000000000002,\n max_leaf_nodes=null,\n min_impurity_decrease=0.0,\n min_impurity_split=null,\n min_samples_leaf=1,\n min_samples_split=2,\n min_weight_fraction_leaf=0.0,\n n_estimators=100, n_jobs=null,\n oob_score=false,\n random_state=null, verbose=0,\n warm_start=false),\n max_features=null, norm_order=1, prefit=false, threshold=0.05))', '(robustscaler, RobustScaler(copy=true, quantile_range=(25.0, 75.0), with_centering=true,\n with_scaling=true))', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.9500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=500, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData mean Number', 'MagpieData avg_dev AtomicWeight', 'MagpieData mean MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData avg_dev NsValence', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NValence', 'MagpieData avg_dev NpUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'Yang omega', 'Yang delta', 'Miedema_deltaH_amor']}
fold_1 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.1))', '(fastica, FastICA(algorithm=parallel, fun=logcosh, fun_args=null, max_iter=200,\n n_components=null, random_state=null, tol=0.7000000000000001,\n w_init=null, whiten=true))', '(kneighborsregressor, KNeighborsRegressor(algorithm=auto, leaf_size=30, metric=minkowski,\n metric_params=null, n_jobs=null, n_neighbors=4, p=1,\n weights=distance))'], 'features_reduced': ['MagpieData mean Number', 'MagpieData avg_dev Number', 'MagpieData mean MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NValence', 'MagpieData avg_dev NsUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData avg_dev GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'Yang omega', 'Yang delta', 'Miedema_deltaH_amor']}
fold_2 {'best_pipeline': ['(selectpercentile, SelectPercentile(percentile=53,\n score_func=<function f_regression at 0x2aaaf79a38c8>))', '(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(kneighborsregressor, KNeighborsRegressor(algorithm=auto, leaf_size=30, metric=minkowski,\n metric_params=null, n_jobs=null, n_neighbors=4, p=2,\n weights=distance))'], 'features_reduced': ['MagpieData mean Number', 'MagpieData avg_dev Number', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData mean CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NValence', 'MagpieData avg_dev NsUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'Yang omega', 'Yang delta', 'Miedema_deltaH_inter']}
fold_3 {'best_pipeline': ['(variancethreshold, VarianceThreshold(threshold=0.1))', '(minmaxscaler, MinMaxScaler(copy=true, feature_range=(0, 1)))', '(kneighborsregressor, KNeighborsRegressor(algorithm=auto, leaf_size=30, metric=minkowski,\n metric_params=null, n_jobs=null, n_neighbors=4, p=2,\n weights=distance))'], 'features_reduced': ['MagpieData mean Number', 'MagpieData avg_dev Number', 'MagpieData mean MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NValence', 'MagpieData avg_dev NsUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData avg_dev GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'Yang omega', 'Yang delta', 'Miedema_deltaH_inter']}
fold_4 {'best_pipeline': ['(selectfrommodel, SelectFromModel(estimator=ExtraTreesRegressor(bootstrap=false, criterion=mse,\n max_depth=null,\n max_features=0.9500000000000002,\n max_leaf_nodes=null,\n min_impurity_decrease=0.0,\n min_impurity_split=null,\n min_samples_leaf=1,\n min_samples_split=2,\n min_weight_fraction_leaf=0.0,\n n_estimators=100, n_jobs=null,\n oob_score=false,\n random_state=null, verbose=0,\n warm_start=false),\n max_features=null, norm_order=1, prefit=false, threshold=0.05))', '(zerocount, ZeroCount())', '(extratreesregressor, ExtraTreesRegressor(bootstrap=false, criterion=mse, max_depth=null,\n max_features=0.7500000000000002, max_leaf_nodes=null,\n min_impurity_decrease=0.0, min_impurity_split=null,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=500, n_jobs=null,\n oob_score=false, random_state=null, verbose=0,\n warm_start=false))'], 'features_reduced': ['MagpieData mean Number', 'MagpieData avg_dev Number', 'MagpieData mean MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData avg_dev NsValence', 'MagpieData avg_dev NpValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev NValence', 'MagpieData avg_dev NpUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData avg_dev GSbandgap', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'Yang omega', 'Yang delta', 'Miedema_deltaH_inter']}
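
The reduced feature sets differ slightly from fold to fold, because feature reduction is rerun on each training split. A minimal sketch of how to compare them is shown below. It uses only an excerpt of the data (the last three entries of each matbench_steels 'features_reduced' list), and the variable names are hypothetical.

```python
from collections import Counter

# Excerpt only: the last three entries of each fold's 'features_reduced'
# list for matbench_steels, copied from the fold parameters above.
tail_features = {
    "fold_0": ["Yang omega", "Yang delta", "Miedema_deltaH_amor"],
    "fold_1": ["Yang omega", "Yang delta", "Miedema_deltaH_amor"],
    "fold_2": ["Yang omega", "Yang delta", "Miedema_deltaH_inter"],
    "fold_3": ["Yang omega", "Yang delta", "Miedema_deltaH_inter"],
    "fold_4": ["Yang omega", "Yang delta", "Miedema_deltaH_inter"],
}

# Features retained in every fold of the excerpt.
common = set.intersection(*(set(feats) for feats in tail_features.values()))

# Features retained in only some folds, with the number of folds keeping them.
counts = Counter(name for feats in tail_features.values() for name in feats)
varying = {name: n for name, n in counts.items() if n < len(tail_features)}

print(sorted(common))
print(varying)
```

Applied to the full lists, the same approach shows which descriptors are stable across folds and which are only selected for particular training splits.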