automatminer.utils package

Submodules

automatminer.utils.log module

Utils for logging.

automatminer.utils.log.initialize_logger(logger_name, log_dir='.', level=None)

Initialize the default logger with stdout and file handlers.

Parameters
  • logger_name (str) – The package name.

  • log_dir (str) – Path to the folder where the log file will be written.

  • level (int) – The log level, for example logging.DEBUG.

Returns

A logging instance with customized formatter and handlers.

Return type

(Logger)
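As a rough illustration of what a logger with both stdout and file handlers looks like, here is a minimal sketch built only on the standard logging module. It is not automatminer's implementation: the name make_logger, the log file name, the format string, and the INFO fallback when level is None are all assumptions made for the example.

```python
import logging
import os
import sys
import tempfile


def make_logger(logger_name, log_dir=".", level=None):
    """Hypothetical stand-in: a logger with a stdout handler and a file handler."""
    logger = logging.getLogger(logger_name)
    # Assumption: fall back to INFO when no level is given.
    logger.setLevel(level if level is not None else logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls

    formatter = logging.Formatter("%(asctime)s %(levelname)-8s %(message)s")
    log_path = os.path.join(log_dir, logger_name + ".log")  # assumed file name
    for handler in (logging.StreamHandler(sys.stdout), logging.FileHandler(log_path)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


log_dir = tempfile.mkdtemp()
logger = make_logger("demo_pkg", log_dir=log_dir, level=logging.DEBUG)
logger.debug("Logger is ready.")  # written to stdout and to demo_pkg.log
```

Clearing existing handlers first keeps repeated initialization from emitting every message twice.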

automatminer.utils.log.initialize_null_logger(name)

Initialize a dummy logger which will swallow all logging commands.

Parameters

name (str) – The package name.

Returns

A dummy logging instance with no output.

Return type

(Logger)
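One common way to obtain a logger that discards everything is to attach a logging.NullHandler and disable propagation. The sketch below shows that pattern; the function name make_null_logger and the "_null" name suffix are assumptions, and the library may build its dummy logger differently.

```python
import logging


def make_null_logger(name):
    """Hypothetical stand-in: a logger that produces no output."""
    logger = logging.getLogger(name + "_null")  # assumed naming scheme
    logger.addHandler(logging.NullHandler())    # handler that drops every record
    logger.propagate = False                    # keep records away from parent loggers
    return logger


null_logger = make_null_logger("demo_pkg")
null_logger.warning("This message is discarded.")
```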

automatminer.utils.log.log_progress(logger, operation)

Decorator to auto-log progress before and after executing a method, such as fit and transform. Should only be applied to DataFrameTransformers.

For example,

INFO: Beginning AutoFeaturizer fitting.
… autofeaturizer logs …
INFO: Finished AutoFeaturizer fitting.

Parameters
  • logger (logging.Logger) – A logger object to help log progress.

  • operation (str) – Some info about the operation you want to log.

Returns

A wrapper for the input method.
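A decorator of this kind can be sketched as follows. This is an illustration under assumptions, not the library's code: log_progress_sketch and DemoTransformer are hypothetical names, and the message wording is modeled on the example output above.

```python
import functools
import io
import logging


def log_progress_sketch(logger, operation):
    """Hypothetical decorator factory: log before and after a method call."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            name = type(self).__name__
            logger.info("Beginning %s %s.", name, operation)
            result = method(self, *args, **kwargs)
            logger.info("Finished %s %s.", name, operation)
            return result
        return wrapper
    return decorator


# Capture log output in memory so the messages can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
logger = logging.getLogger("progress_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)


class DemoTransformer:
    @log_progress_sketch(logger, "fitting")
    def fit(self, df):
        return self


DemoTransformer().fit(df=None)
print(stream.getvalue())
```

Taking the class name from type(self) is what lets one decorator produce messages specific to each transformer.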

automatminer.utils.ml module

Tools and utils for machine learning.

automatminer.utils.ml.is_greater_better(scoring_function)

Determine whether a larger value of scoring_function is more favorable.

Parameters

scoring_function (str) – The name of a scoring function supported by TPOT and sklearn.

Returns

True if a larger value of the scoring metric is better, False if a smaller value is better.

Return type

bool
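The idea can be sketched as a lookup over known metric names. The two sets below are a small illustrative subset chosen for this example, not the list the library uses, and is_greater_better_sketch is a hypothetical name. Note that sklearn's "neg_" scorers are negated errors, so larger values are better for them.

```python
# Illustrative subset of scorer names (an assumption, not the library's list).
HIGHER_IS_BETTER = {
    "accuracy", "f1", "roc_auc", "r2",
    "neg_mean_absolute_error", "neg_mean_squared_error",
}
LOWER_IS_BETTER = {
    "mean_absolute_error", "mean_squared_error", "median_absolute_error",
}


def is_greater_better_sketch(scoring_function):
    """Return True if larger scores are better, False if smaller are better."""
    if scoring_function in HIGHER_IS_BETTER:
        return True
    if scoring_function in LOWER_IS_BETTER:
        return False
    raise ValueError(f"Unknown scoring function: {scoring_function!r}")


print(is_greater_better_sketch("r2"))
print(is_greater_better_sketch("mean_absolute_error"))
```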

automatminer.utils.ml.regression_or_classification(series)

Determine if a series (target column) is numeric or categorical, to decide on the problem as regression or classification.

Parameters

series (pandas.Series) – The target column.

Returns

“regression” or “classification”

Return type

(str)
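The decision rule (numeric target means regression, categorical target means classification) can be sketched without pandas by operating on a plain list that stands in for the series' values. The name problem_type_sketch is hypothetical, and treating booleans as categorical is an assumption of this sketch, not a documented behavior.

```python
from numbers import Number


def problem_type_sketch(values):
    """Return "regression" for numeric targets, otherwise "classification"."""
    numeric = all(
        isinstance(v, Number) and not isinstance(v, bool)  # assumption: bools are labels
        for v in values
    )
    return "regression" if numeric else "classification"


print(problem_type_sketch([1.5, 2.0, 3]))
print(problem_type_sketch(["metal", "nonmetal", "metal"]))
```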

automatminer.utils.pkg module

Utils specific to this package.

exception automatminer.utils.pkg.AutomatminerError(msg)

Bases: BaseException

Exception specific to automatminer methods.

exception automatminer.utils.pkg.VersionError(msg)

Bases: automatminer.utils.pkg.AutomatminerError

Version errors

automatminer.utils.pkg.check_fitted(func)

Decorator to check if a transformer has been fitted.

Parameters

func – A function or method.

Returns

A wrapper function for the input function/method.
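A fitted-check decorator typically inspects a flag on the instance before running the wrapped method. The sketch below assumes a boolean attribute named is_fit and raises RuntimeError; both the attribute name and the exception type are assumptions for illustration, as are the names check_fitted_sketch and DemoTransformer.

```python
import functools


def check_fitted_sketch(func):
    """Hypothetical decorator: refuse to run a method on an unfitted object."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if not getattr(self, "is_fit", False):  # assumed flag name
            raise RuntimeError(
                f"{type(self).__name__} must be fitted before calling {func.__name__}()."
            )
        return func(self, *args, **kwargs)
    return wrapper


class DemoTransformer:
    def __init__(self):
        self.is_fit = False

    def fit(self, data):
        self.is_fit = True
        return self

    @check_fitted_sketch
    def transform(self, data):
        return [x * 2 for x in data]


transformer = DemoTransformer()
transformer.fit([1, 2])
print(transformer.transform([1, 2]))
```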

automatminer.utils.pkg.compare_columns(df1, df2, ignore=None)

Compare the columns of two dataframes.

Parameters
  • df1 (pandas.DataFrame) – The first dataframe.

  • df2 (pandas.DataFrame) – The second dataframe.

  • ignore ([str]) – The feature labels to ignore in the analysis.

Returns

{"df1_not_in_df2": [The columns in df1 not in df2],
 "df2_not_in_df1": [The columns in df2 not in df1],
 "mismatch": (bool)}

Return type

(dict)
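The shape of the returned dictionary can be illustrated with plain lists of column labels standing in for df1.columns and df2.columns. This is a sketch, not the library's code: compare_labels_sketch is a hypothetical name, and setting "mismatch" to True whenever either list is non-empty is an assumption.

```python
def compare_labels_sketch(cols1, cols2, ignore=None):
    """Compare two lists of column labels, skipping any labels in `ignore`."""
    ignore = set(ignore or [])
    only_1 = [c for c in cols1 if c not in cols2 and c not in ignore]
    only_2 = [c for c in cols2 if c not in cols1 and c not in ignore]
    return {
        "df1_not_in_df2": only_1,
        "df2_not_in_df1": only_2,
        "mismatch": bool(only_1 or only_2),  # assumed meaning of "mismatch"
    }


result = compare_labels_sketch(["a", "b", "target"], ["a", "c", "target"])
print(result)
```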

automatminer.utils.pkg.get_version()

Get the version of automatminer without worrying about circular imports in __init__.

Returns

The automatminer version string.

Return type

(str)

automatminer.utils.pkg.return_attrs_recursively(obj)

Returns attributes of an object recursively. Stops recursion when attrs go outside of the automatminer library.

Parameters

obj (object) – The object with attrs

Returns

The dictionary containing attributes, which can be pretty-printed.

Return type

(dict)

automatminer.utils.pkg.save_dict_to_file(d, filename)

Save a dictionary to a persistent file. Supported formats and extensions are text (‘.txt’), JSON (‘.json’), and YAML (‘.yaml’, ‘.yml’).

If no extension is provided, text format will be used.

Parameters
  • d (dict) – A dictionary of strings or objects castable to python native objects (e.g., NumPy integers).

  • filename (str) – The filename and extension to save the file. For example, “mydict.json”.

Return type

None

Returns

None
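Dispatching on the file extension can be sketched as below. This is an illustration only: save_dict_sketch is a hypothetical name, the "key: value" text layout is an assumption, and YAML support is left out because it needs a third-party package.

```python
import json
import os
import tempfile


def save_dict_sketch(d, filename):
    """Write a dict as JSON or plain text, chosen by the file extension."""
    ext = os.path.splitext(filename)[1].lower()
    if ext == ".json":
        with open(filename, "w") as f:
            json.dump(d, f, indent=2)
    elif ext in ("", ".txt"):  # no extension falls back to text
        with open(filename, "w") as f:
            for key, value in d.items():
                f.write(f"{key}: {value}\n")  # assumed text layout
    else:
        raise ValueError(f"Unsupported extension: {ext!r}")


out_dir = tempfile.mkdtemp()
json_path = os.path.join(out_dir, "mydict.json")
text_path = os.path.join(out_dir, "mydict")
save_dict_sketch({"n_features": 12, "model": "demo"}, json_path)
save_dict_sketch({"n_features": 12, "model": "demo"}, text_path)
```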

automatminer.utils.pkg.set_fitted(func)

Decorator to ensure a transformer is fitted properly.

Parameters

func – A function or method.

Returns

A wrapper function for the input function/method.
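One way to make sure a transformer is only marked as fitted after a successful fit is to reset a flag before the call and set it afterwards. The sketch below assumes a boolean attribute named is_fit; that attribute name, set_fitted_sketch, and DemoTransformer are hypothetical and used only for illustration.

```python
import functools


def set_fitted_sketch(func):
    """Hypothetical decorator: mark the object as fitted only if func succeeds."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        self.is_fit = False  # a failed fit must not leave a stale flag behind
        result = func(self, *args, **kwargs)
        self.is_fit = True   # reached only when func did not raise
        return result
    return wrapper


class DemoTransformer:
    @set_fitted_sketch
    def fit(self, data):
        if not data:
            raise ValueError("Cannot fit on empty data.")
        return self


transformer = DemoTransformer().fit([1, 2, 3])
print(transformer.is_fit)
```

Resetting the flag first means a refit that fails leaves the object unfitted rather than in its earlier fitted state.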

Module contents