matminer.featurizers.utils package

Subpackages

Submodules

matminer.featurizers.utils.grdf module

Functions designed to work with the Generalized Radial Distribution Function

class matminer.featurizers.utils.grdf.AbstractPairwise

Bases: object

Abstract class for pairwise functions used in Generalized Radial Distribution Function

name()

Make a label for this pairwise function

Returns:

(string) Label for the function

volume(cutoff)

Compute the volume of this pairwise function

Args:

cutoff (float): Cutoff distance for radial distribution function

Returns:

(float): Volume of bin

class matminer.featurizers.utils.grdf.Bessel(n)

Bases: AbstractPairwise

Bessel pairwise function

__init__(n)

Initialize the function

Args:

n (int): Degree of Bessel function

class matminer.featurizers.utils.grdf.Cosine(a)

Bases: AbstractPairwise

Cosine pairwise function: cos(ar)

__init__(a)

Initialize the function

Args:

a (float): Frequency factor for cosine function

volume(cutoff)

Compute the volume of this pairwise function

Args:

cutoff (float): Cutoff distance for radial distribution function

Returns:

(float): Volume of bin

class matminer.featurizers.utils.grdf.Gaussian(width, center)

Bases: AbstractPairwise

Gaussian function, with specified width and center

__init__(width, center)

Initialize the gaussian function

Args:

width (float): Width of the Gaussian

center (float): Center of the Gaussian

volume(cutoff)

Compute the volume of this pairwise function

Args:

cutoff (float): Cutoff distance for radial distribution function

Returns:

(float): Volume of bin

class matminer.featurizers.utils.grdf.Histogram(start, width)

Bases: AbstractPairwise

Rectangular window function, used in conventional Radial Distribution Functions

__init__(start, width)

Initialize the window function

Args:

start (float): Beginning of window

width (float): Size of window

volume(cutoff)

Compute the volume of this pairwise function

Args:

cutoff (float): Cutoff distance for radial distribution function

Returns:

(float): Volume of bin
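To make the idea of a bin "volume" concrete, here is a minimal sketch of a rectangular window and the spherical-shell volume it covers. The class name WindowSketch is hypothetical and this is not matminer's implementation; in particular, clipping the shell at the cutoff is an assumption made for this illustration.

```python
import math


class WindowSketch:
    """Hypothetical stand-in for a rectangular window; not matminer's class."""

    def __init__(self, start, width):
        self.start = start
        self.width = width

    def __call__(self, r):
        # Weight is 1 inside [start, start + width) and 0 elsewhere
        return 1.0 if self.start <= r < self.start + self.width else 0.0

    def volume(self, cutoff):
        # Volume of the spherical shell spanned by the window.
        # Assumption: the shell is clipped at the cutoff radius.
        outer = min(self.start + self.width, cutoff)
        inner = min(self.start, cutoff)
        return 4.0 / 3.0 * math.pi * (outer ** 3 - inner ** 3)


window = WindowSketch(start=1.0, width=1.0)
shell_volume = window.volume(cutoff=10.0)  # shell between r = 1 and r = 2
```

Dividing a bin's count by a volume of this kind is what turns a raw neighbor count into a density, which is why each pairwise function exposes volume(cutoff).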

class matminer.featurizers.utils.grdf.Sine(a)

Bases: AbstractPairwise

Sine pairwise function: sin(ar)

__init__(a)

Initialize the function

Args:

a (float): Frequency factor for sine function

volume(cutoff)

Compute the volume of this pairwise function

Args:

cutoff (float): Cutoff distance for radial distribution function

Returns:

(float): Volume of bin

matminer.featurizers.utils.grdf.initialize_pairwise_function(name, **options)

Create a new pairwise function object

Args:

name (string): Name of class to instantiate

Keyword Arguments:

Any options for the pairwise class (see each pairwise function for details)
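A name-based factory of this kind can be sketched as a lookup table from names to classes, with the keyword options forwarded to the constructor. The classes, the registry, and the function make_pairwise below are hypothetical stand-ins; how matminer resolves names (for example, whether matching is case-insensitive) is not stated here, so the lowercasing is an assumption.

```python
class SineSketch:
    """Hypothetical stand-in for a sine pairwise function."""

    def __init__(self, a):
        self.a = a


class GaussianSketch:
    """Hypothetical stand-in for a Gaussian pairwise function."""

    def __init__(self, width, center):
        self.width = width
        self.center = center


# Assumption: names are matched case-insensitively
_REGISTRY = {"sine": SineSketch, "gaussian": GaussianSketch}


def make_pairwise(name, **options):
    """Instantiate the class registered under `name` with the given options."""
    try:
        cls = _REGISTRY[name.lower()]
    except KeyError:
        raise ValueError(f"Unknown pairwise function: {name!r}") from None
    return cls(**options)


gaussian = make_pairwise("Gaussian", width=0.5, center=2.0)
sine = make_pairwise("sine", a=3.0)
```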

matminer.featurizers.utils.oxidation module

matminer.featurizers.utils.oxidation.has_oxidation_states(comp)

Check if a composition object has oxidation states for each element

Args:

comp (Composition): Composition to check

Returns:

(boolean) Whether this composition object contains oxidation states
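The check can be pictured as "every species in the composition carries an oxidation state". The sketch below uses hypothetical stand-in classes rather than real Composition objects, and the attribute name oxi_state is an assumption of this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementSketch:
    """Hypothetical bare element: no oxidation state attached."""
    symbol: str


@dataclass(frozen=True)
class SpeciesSketch:
    """Hypothetical element decorated with an oxidation state."""
    symbol: str
    oxi_state: float


def all_have_oxidation_states(species):
    # True only if every entry exposes a non-None oxidation state
    return all(getattr(s, "oxi_state", None) is not None for s in species)


decorated = [SpeciesSketch("Na", 1), SpeciesSketch("Cl", -1)]
bare = [ElementSketch("Na"), ElementSketch("Cl")]
mixed = [SpeciesSketch("Na", 1), ElementSketch("Cl")]
```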

matminer.featurizers.utils.stats module

General methods for computing property statistics from a list of values

class matminer.featurizers.utils.stats.PropertyStats

Bases: object

This class contains statistical operations that are commonly employed when computing features.

The primary way for interacting with this class is to call the calc_stat function, which takes the name of the statistic you would like to compute and the weights/values of data to be assessed. For example, computing the mean of a list looks like:

from matminer.featurizers.utils.stats import PropertyStats
x = [1, 2, 3]
PropertyStats.calc_stat(x, 'mean') # Result is 2
PropertyStats.calc_stat(x, 'mean', weights=[0, 0, 1]) # Result is 3

Some of the statistics functions take options (e.g., Holder means). You can pass them to the statistics functions by adding them after the name and two colons. For example, the 0th Holder mean would be:

PropertyStats.calc_stat(x, 'holder_mean::0')

You can, of course, call the statistical functions directly. All take at least two arguments. The first is the data being assessed and the second, optional, argument is the weights.

static avg_dev(data_lst, weights=None)

Mean absolute deviation of list of element data.

This is computed by first calculating the mean of the list, and then computing the average absolute difference between each value and the mean.

Args:

data_lst (list of floats): List of values to be assessed

weights (list of floats): Weights for each value

Returns:

mean absolute deviation
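The two-step computation described above (mean first, then the average absolute difference from it) can be sketched in plain Python. The function names are hypothetical, and applying the same weights in both steps is an assumption of this sketch rather than a statement about the library's internals.

```python
def weighted_mean(values, weights=None):
    """Weighted arithmetic mean; equal weights when none are given."""
    if weights is None:
        weights = [1.0] * len(values)
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_avg_dev(values, weights=None):
    """Mean absolute deviation about the (weighted) mean."""
    center = weighted_mean(values, weights)
    deviations = [abs(v - center) for v in values]
    # Assumption: the same weights are used for the mean and the deviations
    return weighted_mean(deviations, weights)


unweighted = weighted_avg_dev([1, 2, 3])           # deviations are 1, 0, 1
single = weighted_avg_dev([1, 2, 3], [0, 0, 1])    # all weight on one value
```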

static calc_stat(data_lst, stat, weights=None)

Compute a property statistic

Args:

data_lst (list of floats): List of values

stat (str): Name of the statistic to compute. If the statistics function takes arguments, add them after the name, separated by two colons. For example, the 2nd Holder mean would be “holder_mean::2”

weights (list of floats): (Optional) Weights for each element in data_lst

Returns:

float - Desired statistic
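The "name::option" convention can be illustrated by splitting the string on two colons and forwarding the remaining pieces to the chosen function. Everything below (compute_stat, the STATS table, and the two statistics) is a hypothetical sketch of the idea, not matminer's code; it assumes options arrive as strings and are converted by the receiving function.

```python
import math


def mean(values, weights=None):
    """Weighted arithmetic mean; equal weights when none are given."""
    if weights is None:
        weights = [1.0] * len(values)
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def holder_mean(values, weights=None, power=1):
    """Weighted power mean; `power` may arrive as a string option."""
    power = float(power)
    if power == 0:
        # The p -> 0 limit of the power mean is the geometric mean
        return math.exp(mean([math.log(v) for v in values], weights))
    return mean([v ** power for v in values], weights) ** (1.0 / power)


STATS = {"mean": mean, "holder_mean": holder_mean}


def compute_stat(values, stat, weights=None):
    # "holder_mean::0" -> name "holder_mean", options ["0"]
    name, *options = stat.split("::")
    return STATS[name](values, weights, *options)


x = [1, 2, 3]
plain = compute_stat(x, "mean")
geometric = compute_stat(x, "holder_mean::0")
quadratic = compute_stat(x, "holder_mean::2")
```

Keeping the options inside the statistic's name lets a list of plain strings describe a whole set of statistics, which is convenient when the list is stored as configuration.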

static eigenvalues(data_lst, symm=False, sort=False)

Return the eigenvalues of a matrix as a numpy array

Args:

data_lst: (matrix-like) of values

symm: whether to assume the matrix is symmetric

sort: whether to sort the eigenvalues

Returns:

eigenvalues

static flatten(data_lst, weights=None)

Returns a flattened copy of data_lst as a numpy array

static geom_std_dev(data_lst, weights=None)

Geometric standard deviation

Args:

data_lst (list of floats): List of values to be assessed

weights (list of floats): Weights for each value

Returns:

geometric standard deviation

static holder_mean(data_lst, weights=None, power=1)

Get Holder mean

Args:

data_lst: (list/array) of values

weights: (list/array) of weights

power: (int/float/str) which Holder mean to compute

Returns:

Holder mean
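For reference, the standard definition of the weighted Holder (power) mean is written out below, with values x_i, weights w_i, and power p. The symbols are introduced here for illustration, and the formula is the textbook definition rather than a transcription of the library's code. The p = 0 case is defined as the limit, which is the weighted geometric mean; p = 1 gives the arithmetic mean and p = -1 the harmonic mean.

```latex
M_p(x; w) = \left( \frac{\sum_i w_i \, x_i^{p}}{\sum_i w_i} \right)^{1/p} \quad (p \neq 0),
\qquad
M_0(x; w) = \exp\!\left( \frac{\sum_i w_i \ln x_i}{\sum_i w_i} \right)
```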

static inverse_mean(data_lst, weights=None)

Mean of the inverse of each entry

Args:

data_lst (list of floats): List of values to be assessed

weights (list of floats): Weights for each value

Returns:

inverse mean

static kurtosis(data_lst, weights=None)

Kurtosis of a list of data

Args:

data_lst (list of floats): List of values to be assessed

weights (list of floats): Weights for each value

Returns:

kurtosis

static maximum(data_lst, weights=None)

Maximum value in a list

Args:

data_lst (list of floats): List of values to be assessed

weights: (ignored)

Returns:

maximum value

static mean(data_lst, weights=None)

Arithmetic mean of list

Args:

data_lst (list of floats): List of values to be assessed

weights (list of floats): Weights for each value

Returns:

mean value

static minimum(data_lst, weights=None)

Minimum value in a list

Args:

data_lst (list of floats): List of values to be assessed

weights: (ignored)

Returns:

minimum value

static mode(data_lst, weights=None)

Mode of a list of data.

If multiple elements occur equally frequently (or have the same weight, if weights are provided), this function returns the minimum of those values.

Args:

data_lst (list of floats): List of values to be assessed

weights (list of floats): Weights for each value

Returns:

mode
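The tie-breaking rule can be sketched as follows: find the largest total weight, then return the smallest value that reaches it. The function weighted_mode is hypothetical, and summing the weights of repeated values is an assumption of this sketch that may differ from how the library treats weighted input.

```python
from collections import defaultdict


def weighted_mode(values, weights=None):
    """Most heavily weighted value; ties resolve to the smallest value."""
    if weights is None:
        # Without weights, every occurrence counts once
        weights = [1.0] * len(values)
    totals = defaultdict(float)
    for value, weight in zip(values, weights):
        # Assumption: weights of repeated values are summed
        totals[value] += weight
    best = max(totals.values())
    return min(value for value, total in totals.items() if total == best)


tie_by_count = weighted_mode([3, 1, 1, 3, 2])             # 1 and 3 both occur twice
tie_by_weight = weighted_mode([1, 2, 3], [0.2, 0.4, 0.4])  # 2 and 3 share the top weight
```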

static quantile(data_lst, weights=None, q=0.5)

Return a specific quantile.

Args:

data_lst (list or np.ndarray): 1D data list to be used for computing quantiles

q (float): The quantile, as a fraction between 0 and 1.

Returns:

(float) The computed quantile of the data_lst.

static range(data_lst, weights=None)

Range of a list

Args:

data_lst (list of floats): List of values to be assessed

weights: (ignored)

Returns:

range

static skewness(data_lst, weights=None)

Skewness of a list of data

Args:

data_lst (list of floats): List of values to be assessed

weights (list of floats): Weights for each value

Returns:

skewness

static sorted(data_lst, weights=None)

Returns the sorted data_lst

static std_dev(data_lst, weights=None)

Standard deviation of a list of element data

Args:

data_lst (list of floats): List of values to be assessed

weights (list of floats): Weights for each value

Returns:

standard deviation

Module contents