automatminer.automl package
Subpackages
Submodules
automatminer.automl.adaptors module
Adaptor classes for using AutoML packages in a Matbench pipeline.
Current adaptor classes are:
- TPOTAdaptor: Uses the backend from the AutoML project TPOT, which can be found at https://github.com/EpistasisLab/tpot
-
class automatminer.automl.adaptors.SinglePipelineAdaptor(regressor, classifier)
Bases: automatminer.automl.base.DFMLAdaptor
For running single models or pipelines in a MatPipe pipeline using the same syntax as the AutoML adaptors.
This adaptor should be able to fit into a MatPipe in similar fashion to TPOTAdaptor.
- Parameters
regressor (sklearn Pipeline or BaseEstimator-like) – The object you want to use for machine learning regression. Must implement fit/predict/transform methods analogously to BaseEstimator, but does not need to be a BaseEstimator or Pipeline.
classifier (sklearn Pipeline or BaseEstimator-like) – The object you want to use for machine learning classification.
-
The following unique attributes are set during fitting.
-
property backend
-
property best_pipeline
-
property features
-
fit(**kwargs)
Wrapper for a method to log.
- Parameters
operation (str) – The operation being logged.
- Returns
The method result.
- Return type
result
-
property fitted_target
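The interface listed above (a regressor and a classifier passed in, `backend` absent, and `best_pipeline`, `features`, and `fitted_target` available after fitting) can be illustrated with a minimal, standard-library-only sketch. Everything here is hypothetical: `SinglePipelineSketch`, the toy estimators, the `rows`/`target`/`mode` arguments, and the list-of-dicts input are assumptions made for illustration and are not automatminer's implementation, which works on pandas dataframes.

```python
class MeanRegressor:
    """Toy estimator with sklearn-style fit/predict (hypothetical stand-in)."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class MajorityClassifier:
    """Toy classifier that always predicts the most common training label."""

    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class SinglePipelineSketch:
    """Hypothetical illustration only; not automatminer's SinglePipelineAdaptor."""

    def __init__(self, regressor, classifier):
        self._regressor = regressor
        self._classifier = classifier
        self._best_pipeline = None
        self._features = None
        self._fitted_target = None

    @property
    def backend(self):
        return None  # no AutoML search backend is involved

    @property
    def best_pipeline(self):
        return self._best_pipeline

    @property
    def features(self):
        return self._features

    @property
    def fitted_target(self):
        return self._fitted_target

    def fit(self, rows, target, mode):
        # Assumption: rows is a list of dicts and mode is chosen by the caller.
        self._features = [key for key in rows[0] if key != target]
        X = [[row[f] for f in self._features] for row in rows]
        y = [row[target] for row in rows]
        model = self._regressor if mode == "regression" else self._classifier
        self._best_pipeline = model.fit(X, y)
        self._fitted_target = target
        return self

    def predict(self, rows):
        X = [[row[f] for f in self._features] for row in rows]
        return self._best_pipeline.predict(X)


rows = [
    {"density": 2.0, "n_atoms": 4, "band_gap": 1.0},
    {"density": 3.0, "n_atoms": 8, "band_gap": 3.0},
]
adaptor = SinglePipelineSketch(MeanRegressor(), MajorityClassifier())
adaptor.fit(rows, target="band_gap", mode="regression")
predictions = adaptor.predict([{"density": 2.5, "n_atoms": 6}])
```

Because the estimators are only required to expose `fit` and `predict`, any object with those methods could be swapped in for the toy models in this sketch.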
-
class automatminer.automl.adaptors.TPOTAdaptor(**tpot_kwargs)
Bases: automatminer.automl.base.DFMLAdaptor
A dataframe adaptor for the TPOT classifiers and regressors.
- Parameters
tpot_kwargs – All kwargs accepted by a TPOTRegressor/TPOTClassifier or TPOTBase object.
Note that you can limit the models that TPOT explores by setting config_dict directly. For example, if you want to only use random forest:
    config_dict = {
        'sklearn.ensemble.RandomForestRegressor': {
            'n_estimators': [100],
            'max_features': np.arange(0.05, 1.01, 0.05),
            'min_samples_split': range(2, 21),
            'min_samples_leaf': range(1, 21),
            'bootstrap': [True, False]
        },
    }
-
The following unique attributes are set during fitting.
-
best_models
The best model names and their scores.
- Type
OrderedDict
-
backend
The TPOT object interface used for ML training.
- Type
TPOTBase
-
models
The raw sklearn-style models output by TPOT.
- Type
OrderedDict
-
from_serialized
Whether the backend is loaded from a serialized instance. If True, the previous full TPOT data will not be available due to pickling problems.
- Type
bool
-
property backend
-
property best_models
-
property best_pipeline
-
deserialize(**kwargs)
-
property features
-
fit(**kwargs)
Wrapper for a method to log.
- Parameters
operation (str) – The operation being logged.
- Returns
The method result.
- Return type
result
-
property fitted_target
-
serialize(**kwargs)
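The `serialize`/`deserialize` pair and the `from_serialized` flag suggest a pattern in which the unpicklable backend is detached before saving and reattached afterwards. The sketch below shows one way such a pattern could look, using only the standard library. It is not automatminer's code: `SketchAdaptor`, `UnpicklableBackend`, `save_state`, and the choice to have `serialize` return the detached backend are all assumptions made for illustration.

```python
import pickle


class UnpicklableBackend:
    """Hypothetical stand-in for an AutoML backend that cannot be pickled."""

    def __init__(self):
        # A lambda attribute is enough to make plain pickling fail.
        self.scorer = lambda y_true, y_pred: 0.0


class SketchAdaptor:
    """Hypothetical adaptor; not automatminer's TPOTAdaptor."""

    def __init__(self):
        self.backend = UnpicklableBackend()
        self.best_pipeline = {"model": "stand-in"}  # picklable placeholder
        self.from_serialized = False

    def serialize(self):
        """Detach the backend and hand it back so the caller can restore it."""
        backend, self.backend = self.backend, None
        self.from_serialized = True
        return backend

    def deserialize(self, backend):
        """Reattach a previously detached backend."""
        self.backend = backend
        self.from_serialized = False


def save_state(adaptor):
    """Pickle the adaptor's attributes with the backend detached."""
    backend = adaptor.serialize()
    try:
        return pickle.dumps(vars(adaptor))
    finally:
        # Restore the live object even if pickling raised.
        adaptor.deserialize(backend)


adaptor = SketchAdaptor()
blob = save_state(adaptor)
restored_state = pickle.loads(blob)
```

In this sketch the saved state carries `from_serialized = True` and no backend, while the in-memory adaptor keeps its backend, which mirrors the note that full backend data is unavailable after loading.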
automatminer.automl.base module
Base classes for automl.
-
class automatminer.automl.base.DFMLAdaptor
Bases: automatminer.base.DFTransformer
A base class to adapt from an AutoML backend to a sklearn-style fit/predict scheme and add a few extensions for pandas dataframes.
When implementing an adaptor on top of this base class, make sure to use @check_fitted and @set_fitted if necessary!
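To make the role of the two decorators concrete, here is a minimal sketch of how fitted-state decorators of this kind are commonly written. Only the names `check_fitted` and `set_fitted` come from the text above; the bodies, the `is_fit` attribute, the `NotFittedError` class, and `ToyAdaptor` are assumptions, and automatminer's own decorators may behave differently.

```python
import functools


class NotFittedError(Exception):
    """Hypothetical error raised when a method needs a fitted object."""


def set_fitted(func):
    """Mark the instance as fitted once the wrapped method has returned."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        result = func(self, *args, **kwargs)
        self.is_fit = True  # assumed attribute name
        return result

    return wrapper


def check_fitted(func):
    """Refuse to run the wrapped method unless the instance is fitted."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if not getattr(self, "is_fit", False):
            raise NotFittedError(f"{type(self).__name__} has not been fit")
        return func(self, *args, **kwargs)

    return wrapper


class ToyAdaptor:
    """Hypothetical adaptor that predicts the training mean."""

    @set_fitted
    def fit(self, values):
        self.mean_ = sum(values) / len(values)
        return self

    @check_fitted
    def predict(self, n):
        return [self.mean_] * n
```

With this arrangement, calling `predict` on a fresh `ToyAdaptor` raises `NotFittedError`, and the same call succeeds after `fit` has run.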
-
abstract property backend
The AutoML backend object. Does not need to implement any methods for compatibility with higher level classes. If no AutoML backend is present (e.g., in SinglePipelineAdaptor), backend = None.
Does not need to be serializable, as matpipe.save will not save backends.
-
abstract property best_pipeline
The best ML pipeline found by the backend. Can be any type, though BaseEstimator is preferred.
1. MUST implement a .predict method unless DFMLAdaptor.predict is overridden!
2. MUST be serializable!
Should be as close to the underlying algorithm as possible; i.e., use TPOTClassifier.fitted_pipeline_ rather than the TPOTClassifier object itself, so that examining the true form of models is more straightforward.
-
deserialize(**kwargs)
-
abstract property features
The features being used for machine learning.
- Returns
The feature labels
- Return type
([str])
-
abstract property fitted_target
The target (a string) on which the adaptor was fit.
- Returns
The fitted target label.
- Return type
(str)
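The four abstract properties documented for this base class (`backend`, `best_pipeline`, `features`, `fitted_target`) can be expressed with Python's `abc` module. The sketch below is a hypothetical reduction, not the real `DFMLAdaptor`: the class names, the constructor arguments, and the default `predict` that delegates to `best_pipeline.predict` are assumptions based on the descriptions above.

```python
from abc import ABC, abstractmethod


class AdaptorInterfaceSketch(ABC):
    """Hypothetical reduction of the documented adaptor interface."""

    @property
    @abstractmethod
    def backend(self):
        """The AutoML backend object, or None if there is none."""

    @property
    @abstractmethod
    def best_pipeline(self):
        """The fitted model; must expose .predict in this sketch."""

    @property
    @abstractmethod
    def features(self):
        """List of feature labels used for learning."""

    @property
    @abstractmethod
    def fitted_target(self):
        """Label of the target the adaptor was fit on."""

    def predict(self, X):
        # Assumed default: delegate to whatever best_pipeline holds.
        return self.best_pipeline.predict(X)


class ConstantModel:
    """Toy model that always predicts the same value."""

    def __init__(self, value):
        self.value = value

    def predict(self, X):
        return [self.value for _ in X]


class ConstantAdaptor(AdaptorInterfaceSketch):
    """Concrete subclass that implements every abstract property."""

    def __init__(self, value, features, target):
        self._model = ConstantModel(value)
        self._features = list(features)
        self._target = target

    @property
    def backend(self):
        return None

    @property
    def best_pipeline(self):
        return self._model

    @property
    def features(self):
        return self._features

    @property
    def fitted_target(self):
        return self._target


class IncompleteAdaptor(AdaptorInterfaceSketch):
    """Implements only one property, so it cannot be instantiated."""

    @property
    def backend(self):
        return None


constant = ConstantAdaptor(1.5, ["density", "n_atoms"], "band_gap")
constant_predictions = constant.predict([[2.0, 4], [3.0, 8]])
```

Declaring the properties as abstract means a subclass that forgets one of them, such as `IncompleteAdaptor` here, fails with a `TypeError` at instantiation rather than later at prediction time.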
-
predict(**kwargs)
-
serialize(**kwargs)
-
abstract property