matminer.featurizers.composition package¶
Subpackages¶
- matminer.featurizers.composition.tests package
- Submodules
- matminer.featurizers.composition.tests.base module
- matminer.featurizers.composition.tests.test_alloy module
- matminer.featurizers.composition.tests.test_composite module
CompositeFeaturesTest
CompositeFeaturesTest.test_elem()
CompositeFeaturesTest.test_elem_deml()
CompositeFeaturesTest.test_elem_matminer()
CompositeFeaturesTest.test_elem_matscholar_el()
CompositeFeaturesTest.test_elem_megnet_el()
CompositeFeaturesTest.test_elem_optical()
CompositeFeaturesTest.test_elem_transport()
CompositeFeaturesTest.test_fere_corr()
CompositeFeaturesTest.test_meredig()
- matminer.featurizers.composition.tests.test_element module
- matminer.featurizers.composition.tests.test_ion module
- matminer.featurizers.composition.tests.test_orbital module
- matminer.featurizers.composition.tests.test_packing module
- matminer.featurizers.composition.tests.test_thermo module
- Module contents
Submodules¶
matminer.featurizers.composition.alloy module¶
Composition featurizers specialized for use with alloys.
- class matminer.featurizers.composition.alloy.Miedema(struct_types='all', ss_types='min', data_source='Miedema', impute_nan=False)¶
Bases:
BaseFeaturizer
Formation enthalpies of intermetallic compounds, from Miedema et al.
Calculate the formation enthalpies of the intermetallic compound, solid solution and amorphous phase of a given composition, based on the semi-empirical Miedema model (and some extensions), particularly for transition metal alloys.
Supports elemental, binary and multicomponent alloys. For elemental and binary alloys, the formulation is based on the original works by Miedema et al. in the 1980s. For multicomponent alloys, the formulation is essentially a linear combination of the sub-binary systems. This is reported to work well for ternary alloys, but results for quaternary and higher-order alloys should be treated with caution.
- Args:
- struct_types (str or [str]): default=’all’
‘inter’: intermetallic compound
‘ss’: solid solution
‘amor’: amorphous phase
‘all’: same as [‘inter’, ‘ss’, ‘amor’]
[‘inter’, ‘ss’]: intermetallic compound and solid solution
- ss_types (str or [str]): only for ss, default=’min’
‘fcc’: fcc solid solution
‘bcc’: bcc solid solution
‘hcp’: hcp solid solution
‘no_latt’: solid solution with no specific structure type
‘min’: min value of [‘fcc’, ‘bcc’, ‘hcp’, ‘no_latt’]
‘all’: same as [‘fcc’, ‘bcc’, ‘hcp’, ‘no_latt’]
[‘fcc’, ‘bcc’]: fcc and bcc solid solutions
- data_source (str): source of dataset, default=’Miedema’
‘Miedema’: ‘Miedema.csv’ placed in “matminer/utils/data_files/”, containing the following model parameters for 73 elements: ‘molar_volume’, ‘electron_density’, ‘electronegativity’, ‘valence_electrons’, ‘a_const’, ‘R_const’, ‘H_trans’, ‘compressibility’, ‘shear_modulus’, ‘melting_point’, ‘structural_stability’. Please see the references for details.
- impute_nan (bool): if True, the features for the elements
that are missing from the data_source or are NaNs are replaced by the average of each feature over the available elements.
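The impute_nan behaviour described above can be pictured with a small standalone sketch. It is not the library's implementation: the function name impute_missing and the two-feature toy table are hypothetical, and the only rule taken from the description is that missing or NaN entries are replaced by the per-feature average over the elements that do have data.

```python
import math


def impute_missing(table, elements):
    """Return one feature row per requested element, filling gaps with column means.

    table maps element -> list of feature values (NaN marks a missing value).
    Elements absent from the table are treated as entirely missing.
    """
    n_features = len(next(iter(table.values())))
    # Average of each feature over the elements that actually have a value.
    means = []
    for k in range(n_features):
        known = [row[k] for row in table.values() if not math.isnan(row[k])]
        means.append(sum(known) / len(known))
    filled = {}
    for el in elements:
        row = table.get(el, [math.nan] * n_features)
        filled[el] = [means[k] if math.isnan(v) else v for k, v in enumerate(row)]
    return filled


# Hypothetical data: "B" lacks its second feature and "Z" is not in the table.
toy_table = {"A": [1.0, 10.0], "B": [3.0, math.nan]}
filled = impute_missing(toy_table, ["A", "B", "Z"])
```

With impute_nan=False the same gaps would presumably surface as NaN features, so the flag trades missing values for values biased toward the table average.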
- Returns:
- (list of floats) Miedema formation enthalpies (eV/atom) for input struct_types:
-Miedema_deltaH_inter: for intermetallic compound
-Miedema_deltaH_ss: for solid solution, can include ‘fcc’, ‘bcc’, ‘hcp’, ‘no_latt’, ‘min’ based on input ss_types
-Miedema_deltaH_amor: for amorphous phase
- __init__(struct_types='all', ss_types='min', data_source='Miedema', impute_nan=False)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- deltaH_chem(elements, fracs, struct)¶
Chemical term of formation enthalpy.
- Args:
elements (list of str): list of elements
fracs (list of floats): list of atomic fractions
struct (str): ‘inter’, ‘ss’ or ‘amor’
- Returns:
deltaH_chem (float): chemical term of formation enthalpy
- deltaH_elast(elements, fracs)¶
Elastic term of formation enthalpy.
- Args:
elements (list of str): list of elements
fracs (list of floats): list of atomic fractions
- Returns:
deltaH_elastic (float): elastic term of formation enthalpy
- deltaH_struct(elements, fracs, latt)¶
Structural term of formation enthalpy, only for solid solution.
- Args:
elements (list of str): list of elements
fracs (list of floats): list of atomic fractions
latt (str): ‘fcc’, ‘bcc’, ‘hcp’ or ‘no_latt’
- Returns:
deltaH_struct (float): structural term of formation enthalpy
- deltaH_topo(elements, fracs)¶
Topological term of formation enthalpy, only for amorphous phase.
- Args:
elements (list of str): list of elements
fracs (list of floats): list of atomic fractions
- Returns:
deltaH_topo (float): topological term of formation enthalpy
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
Get Miedema formation enthalpies of target structures: inter, amor, ss (the latter can be further divided into ‘min’, ‘fcc’, ‘bcc’, ‘hcp’, ‘no_latt’ for different lattice types).
- Args:
comp: Pymatgen composition object
- Returns:
miedema (list of floats): formation enthalpies of target structures
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- precheck(c: Composition) → bool¶
Precheck a single entry. Miedema does not work for compositions containing any elements for which the Miedema model has no parameters. To precheck an entire dataframe (and automatically gather the fraction of structures that will pass the precheck), please use precheck_dataframe.
- Args:
c (pymatgen.Composition): The composition to precheck.
- Returns:
(bool): If True, c passed the precheck; otherwise, it failed.
- class matminer.featurizers.composition.alloy.WenAlloys(impute_nan=False)¶
Bases:
BaseFeaturizer
Calculate features for alloy properties.
Based on the work:
“Machine learning assisted design of high entropy alloys with desired property” by Wen et al., Acta Materialia 170, 109-117 (2019).
Copyright 2020 Battelle Energy Alliance, LLC ALL RIGHTS RESERVED
- Features:
Yang omega
Yang delta
Radii local mismatch
Radii gamma
Configuration entropy
Lambda entropy
Electronegativity delta
Electronegativity local mismatch
VEC mean
Mixing enthalpy
Mean cohesive energy
Interant electrons
Shear modulus mean
Shear modulus delta
Shear modulus local mismatch
Shear modulus strength model
- Args:
- impute_nan (bool): if True, the features for the elements
that are missing from the data_source or are NaNs are replaced by the average of each feature over the available elements.
- __init__(impute_nan=False)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- static compute_atomic_fraction(elements, composition)¶
Get atomic fraction string.
- Args:
elements ([pymatgen.Element or str]): List of elements
composition (pymatgen.Composition): Composition
- Returns:
(str)
- static compute_configuration_entropy(fractions)¶
Compute the configuration entropy.
R \sum^n_{i=1} c_i \ln{c_i}
where c_i is the fraction of each element i and R is the ideal gas constant.
- Args:
fractions ([float]): List of element fractions
- Returns:
(float) configuration entropy
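As a standalone illustration of the configuration entropy, the sketch below evaluates the ideal mixing entropy in plain Python. It assumes the conventional leading minus sign (so the result is non-negative) and SI units of J/(mol K); the sign and units that the library itself returns are not stated in this documentation, and the name configuration_entropy is hypothetical.

```python
import math

R = 8.314462618  # ideal gas constant in J/(mol K)


def configuration_entropy(fractions):
    """Ideal configurational entropy, -R * sum(c_i * ln(c_i)), in J/(mol K).

    Zero fractions are skipped because c * ln(c) tends to 0 as c tends to 0.
    """
    return -R * sum(c * math.log(c) for c in fractions if c > 0)


# For an equimolar five-component alloy the sum reduces to R * ln(5).
s_equimolar = configuration_entropy([0.2] * 5)
```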
- static compute_delta(variable, fractions)¶
Compute Yang’s delta parameter for a generic variable.
\sqrt{\sum^n_{i=1} c_i \left( 1 - \frac{v_i}{\bar{v}} \right)^2 }
where c_i and v_i are the fraction and variable of element i, and \bar{v} is the fraction-weighted average of the variable.
- Args:
variable (list): List of properties to assess
fractions (list): List of fractions to assess
- Returns:
(float) delta
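The delta expression can be checked by hand with a few lines of plain Python. This is only a sketch of the formula as written in this entry, not the library code; the function name yang_delta is hypothetical.

```python
import math


def yang_delta(variable, fractions):
    """sqrt(sum_i c_i * (1 - v_i / v_bar)^2), with v_bar the fraction-weighted mean."""
    v_bar = sum(c * v for c, v in zip(fractions, variable))
    return math.sqrt(
        sum(c * (1 - v / v_bar) ** 2 for c, v in zip(fractions, variable))
    )


# Two elements with values 1 and 3 in equal amounts: v_bar = 2,
# each term contributes 0.5 * 0.25, so delta = sqrt(0.25) = 0.5.
delta = yang_delta([1.0, 3.0], [0.5, 0.5])
```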
- compute_enthalpy(elements, fractions)¶
Compute mixing enthalpy.
- Args:
elements ([pymatgen.Element or str]): List of elements
fractions ([float]): Fractions of elements in composition
- Returns:
(float) H_mixing
- static compute_gamma_radii(miracle_radius_stats)¶
Compute gamma of the radii, based on the solid angles of atomic packing for the elements with the largest and smallest atomic sizes.
\frac{1 - \sqrt{\frac{(r + r_{min})^2 - r^2}{(r + r_{min})^2}}}{1 - \sqrt{\frac{(r + r_{max})^2 - r^2}{(r + r_{max})^2}}}
where r, r_{min} and r_{max} are the mean, minimum and maximum radii.
- Args:
miracle_radius_stats (dict): Dictionary of stats for miracleradius via compute_magpie_summary
- Returns:
(float) gamma
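The gamma ratio can be evaluated directly from three radii. The sketch below takes the mean, minimum and maximum radius as plain floats rather than the statistics dictionary the method expects, so the signature (and the name gamma_radii) is an assumption made for illustration.

```python
import math


def gamma_radii(r_mean, r_min, r_max):
    """Ratio of (1 - sqrt(...)) terms built from the smallest and largest radii."""

    def one_minus_root(r_x):
        outer = (r_mean + r_x) ** 2
        return 1 - math.sqrt((outer - r_mean ** 2) / outer)

    return one_minus_root(r_min) / one_minus_root(r_max)


# Identical radii give gamma = 1; a wider spread pushes gamma above 1.
gamma_equal = gamma_radii(1.3, 1.3, 1.3)
gamma_spread = gamma_radii(1.3, 1.0, 1.6)
```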
- static compute_lambda(yang_delta, entropy)¶
- Args:
yang_delta (float): Yang Solid Solution Delta
entropy (float): Configuration entropy
- Returns:
float
- static compute_local_mismatch(variable, fractions)¶
Compute local mismatch of a given variable.
\sum^n_{i=1} \sum^n_{j=1, i \neq j} c_i c_j | v_i - v_j |^2
where c_{i,j} and v_{i,j} are the fraction and variable of element i,j.
- Args:
variable (list): List of properties to assess
fractions (list): List of fractions to assess
- Returns:
(float) local mismatch
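Written as an explicit double loop, the local mismatch is a sum over ordered pairs of distinct elements, so each unordered pair is counted twice. This is a sketch of the formula only, with a hypothetical function name.

```python
def local_mismatch(variable, fractions):
    """sum over i != j of c_i * c_j * |v_i - v_j|^2."""
    total = 0.0
    n = len(variable)
    for i in range(n):
        for j in range(n):
            if i != j:
                total += fractions[i] * fractions[j] * abs(variable[i] - variable[j]) ** 2
    return total


# Values 1 and 3 in equal amounts: two ordered pairs, each 0.25 * 4 = 1.
mismatch = local_mismatch([1.0, 3.0], [0.5, 0.5])
```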
- compute_magpie_summary(attribute_name, elements, fractions)¶
Get limited list of weighted statistics according to magpie data.
- Args:
attribute_name (str): Name of magpie attribute to retrieve
elements ([pymatgen.element or str]): List of elements
fractions ([float]): List of element fractions
- Returns:
(dict) Dictionary of element-fraction weighted statistics for attribute.
- static compute_strength_local_mismatch_shear(shear_modulus, mean_shear_modulus, fractions)¶
The local mismatch of the shear values.
\sum^n_{i=1} \frac{c_i \frac{2(G_i - G)}{G_i + G}}{\left(1 + 0.5 \left| c_i \frac{2(G_i - G)}{G_i + G} \right| \right)}
where c_{i}, G and G_{i} are the fraction, mean shear modulus and shear modulus of element i.
- Args:
shear_modulus ([float]): List of shear moduli of elements
mean_shear_modulus (float): Mean of shear moduli
fractions ([float]): List of element fractions in the composition
- Returns:
(float) strengthening local mismatch
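A plain-Python reading of the shear-modulus strengthening term is sketched below. It assumes the absolute value applies to the whole weighted mismatch c_i * 2(G_i - G)/(G_i + G) in the denominator; the function name is hypothetical and the code is not taken from the library.

```python
def strength_local_mismatch_shear(shear_modulus, mean_shear_modulus, fractions):
    """sum_i w_i / (1 + 0.5 * |w_i|) with w_i = c_i * 2 * (G_i - G) / (G_i + G)."""
    total = 0.0
    for g_i, c_i in zip(shear_modulus, fractions):
        weighted = c_i * 2 * (g_i - mean_shear_modulus) / (g_i + mean_shear_modulus)
        total += weighted / (1 + 0.5 * abs(weighted))
    return total


# G = [10, 30] with mean 20 and equal fractions: the two terms are
# -2/7 and +2/11, so the sum is -8/77.
mismatch_g = strength_local_mismatch_shear([10.0, 30.0], 20.0, [0.5, 0.5])
```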
- static compute_weight_fraction(elements, composition)¶
Get weight fraction string.
- Args:
elements ([pymatgen.Element or str]): List of elements
composition (pymatgen.Composition): Composition
- Returns:
(str)
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
Get elemental property attributes.
- Args:
comp: Pymatgen composition object
- Returns:
(list): Generated Wen et al. features.
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- precheck(comp)¶
Precheck (provide an estimate of whether a featurizer will work or not) for a single entry (e.g., a single composition). If the entry fails the precheck, it will most likely fail featurization; if it passes, it is likely (but not guaranteed) to featurize correctly.
- Prechecks should be:
accurate (but can be good estimates rather than ground truth)
fast to evaluate
unlikely to be obsolete via changes in the featurizer in the near future
This method should be overridden by any featurizer requiring its use, as by default all entries will pass prechecking. Also, precheck is a good opportunity to throw warnings about long runtimes (e.g., doing nearest neighbors computations on a structure with many thousand sites).
See the documentation for precheck_dataframe for more information.
- Args:
- *x (Composition, Structure, etc.): Input to-be-featurized. Can be
a single input or multiple inputs.
- Returns:
(bool): True, if passes the precheck. False, if fails.
- class matminer.featurizers.composition.alloy.YangSolidSolution(impute_nan=False)¶
Bases:
BaseFeaturizer
Mixing thermochemistry and size mismatch terms of Yang and Zhang (2012)
This featurizer returns two different features developed by Yang and Zhang (https://linkinghub.elsevier.com/retrieve/pii/S0254058411009357) to predict whether metal alloys will form metallic glasses, crystalline solid solutions, or intermetallics. The first, Omega, is related to the balance between the mixing entropy and mixing enthalpy of the liquid phase. The second, delta, is related to the atomic size mismatch between the different elements of the material.
- Features
Yang omega - Mixing thermochemistry feature, Omega
Yang delta - Atomic size mismatch term
- Args:
- impute_nan (bool): if True, the features for the elements
that are missing from the data_source or are NaNs are replaced by the average of each feature over the available elements.
- __init__(impute_nan=False)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- compute_delta(comp)¶
Compute Yang’s delta parameter
\sqrt{\sum^n_{i=1} c_i \left( 1 - \frac{r_i}{\bar{r}} \right)^2 }
where c_i and r_i are the fraction and radius of element i, and \bar{r} is the fraction-weighted average of the radii. We use the radii compiled by Miracle et al. (https://www.tandfonline.com/doi/ref/10.1179/095066010X12646898728200?scroll=top).
- Args:
comp (Composition) - Composition to assess
- Returns:
(float) delta
- compute_omega(comp)¶
Compute Yang’s mixing thermodynamics descriptor
\frac{T_m \Delta S_{mix}}{ | \Delta H_{mix} | }
where T_m is the average melting temperature, \Delta S_{mix} is the ideal mixing entropy, and \Delta H_{mix} is the average of the mixing enthalpies of all pairs of elements in the alloy.
- Args:
comp (Composition) - Composition to featurize
- Returns:
(float) Omega
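The Omega ratio can be sketched with stdlib Python once the three ingredients are available. The assumptions here are loud ones: T_m is taken as the fraction-weighted mean of the elemental melting temperatures, the ideal mixing entropy uses the conventional -R sum(c ln c) form, and the mixing enthalpy is passed in directly (in J/mol, to match the entropy units) rather than looked up from pair data. The function name yang_omega and the input numbers are hypothetical.

```python
import math

R = 8.314462618  # ideal gas constant in J/(mol K)


def yang_omega(fractions, melting_temps, delta_h_mix):
    """Omega = T_m * dS_mix / |dH_mix| with dH_mix supplied by the caller in J/mol."""
    t_m = sum(c * t for c, t in zip(fractions, melting_temps))
    delta_s_mix = -R * sum(c * math.log(c) for c in fractions if c > 0)
    return t_m * delta_s_mix / abs(delta_h_mix)


# Illustrative inputs only: an equimolar binary with T_m = 1500 K
# and a mixing enthalpy of -5000 J/mol.
omega = yang_omega([0.5, 0.5], [1000.0, 2000.0], -5000.0)
```

Because only the magnitude of the mixing enthalpy enters, Omega grows without bound as the enthalpy approaches zero, which is worth keeping in mind when the feature is used for nearly ideal mixtures.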
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
Main featurizer function, which has to be implemented in any derived featurizer subclass.
- Args:
x: input data to featurize (type depends on featurizer).
- Returns:
(list) one or more features.
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- precheck(c: Composition) → bool¶
Precheck a single entry. YangSolidSolution does not work for compositions containing any binary element combinations for which the model has no parameters. We can nearly equivalently approximate this by checking against the unary element list.
To precheck an entire dataframe (and automatically gather the fraction of structures that will pass the precheck), please use precheck_dataframe.
- Args:
c (pymatgen.Composition): The composition to precheck.
- Returns:
(bool): If True, c passed the precheck; otherwise, it failed.
matminer.featurizers.composition.composite module¶
Composition featurizers for composite features containing more than 1 category of general-purpose data.
- class matminer.featurizers.composition.composite.ElementProperty(data_source, features, stats, impute_nan=False)¶
Bases:
BaseFeaturizer
Class to calculate elemental property attributes.
To initialize quickly, use the from_preset() method.
Features: Based on the statistics of the data_source chosen, computed by element stoichiometry. The format generally is:
“{data source} {statistic} {property}”
For example:
“PymatgenData range X” # Range of electronegativity from Pymatgen data
For a list of all statistics, see the PropertyStats documentation; for a list of all attributes available for a given data_source, see the documentation for the data sources (e.g., PymatgenData, MagpieData, MatscholarElementData, etc.).
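To make the label format concrete, the sketch below builds “{data source} {statistic} {property}” labels and the matching values for two simple statistics. It is not the library's code: the function name, the toy lookup table (Pauling electronegativities of Na and Cl) and the restriction to a fraction-weighted mean and an unweighted range are assumptions for illustration.

```python
def element_property_sketch(source_name, table, features, stats, fractions):
    """Return {label: value} for each requested property and statistic.

    table maps element -> {property: value}; fractions maps element -> atomic fraction.
    """
    out = {}
    weights = list(fractions.values())
    for prop in features:
        values = [table[el][prop] for el in fractions]
        for stat in stats:
            if stat == "mean":
                value = sum(w * v for w, v in zip(weights, values))
            elif stat == "range":
                value = max(values) - min(values)
            else:
                raise ValueError(f"unsupported statistic in this sketch: {stat}")
            out[f"{source_name} {stat} {prop}"] = value
    return out


# Toy lookup table: Pauling electronegativity ("X") of Na and Cl.
toy_table = {"Na": {"X": 0.93}, "Cl": {"X": 3.16}}
result = element_property_sketch(
    "PymatgenData", toy_table, ["X"], ["mean", "range"], {"Na": 0.5, "Cl": 0.5}
)
```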
- Args:
- data_source (AbstractData or str): source from which to retrieve
element property data (or use str for preset: “pymatgen”, “magpie”, or “deml”)
- features (list of strings): List of elemental properties to use
(these must be supported by data_source)
- stats (list of strings): a list of weighted statistics to compute for each
property (see PropertyStats for available stats)
- impute_nan (bool): if True, the features for the elements
that are missing from the data_source or are NaNs are replaced by the average of each feature over the available elements.
- __init__(data_source, features, stats, impute_nan=False)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
Get elemental property attributes
- Args:
comp: Pymatgen composition object
- Returns:
all_attributes: Specified property statistics of features
- classmethod from_preset(preset_name, impute_nan=False)¶
Return ElementProperty from a preset string.
- Args:
- preset_name: (str) can be one of “magpie”, “deml”, “matminer”,
“matscholar_el”, or “megnet_el”.
- impute_nan (bool): if True, the features for the elements
that are missing from the data_source or are NaNs are replaced by the average of each feature over the available elements.
- Returns:
ElementProperty based on the preset name.
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- class matminer.featurizers.composition.composite.Meredig(impute_nan=False)¶
Bases:
BaseFeaturizer
Class to calculate features as defined in Meredig et al.
- Features:
Atomic fraction of each of the first 103 elements, in order of atomic number.
17 statistics of elemental properties:
Mean atomic weight of constituent elements
Mean periodic table row and column number
Mean and range of atomic number
Mean and range of atomic radius
Mean and range of electronegativity
Mean number of valence electrons in each orbital
Fraction of total valence electrons in each orbital
- Args:
- impute_nan (bool): if True, the features for the elements
that are missing from the data_source or are NaNs are replaced by the average of each feature over the available elements.
- __init__(impute_nan=False)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
Get elemental property attributes
- Args:
comp: Pymatgen composition object
- Returns:
all_attributes: Specified property statistics of features
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
matminer.featurizers.composition.element module¶
Composition featurizers for elemental data and stoichiometry.
- class matminer.featurizers.composition.element.BandCenter(impute_nan=False)¶
Bases:
BaseFeaturizer
Estimation of absolute position of band center using electronegativity.
- Features
Band center
- __init__(impute_nan=False)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
(Rough) estimation of absolute position of band center using geometric mean of electronegativity.
- Args:
comp (Composition).
- Returns:
(float) band center.
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- class matminer.featurizers.composition.element.ElementFraction¶
Bases:
BaseFeaturizer
Class to calculate the atomic fraction of each element in a composition.
Generates a vector where each index represents an element in atomic number order.
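The vector layout can be illustrated without pymatgen. In the sketch below the element list is truncated to the first ten elements to keep it short, and compositions are given as plain dictionaries of amounts; both choices, and the function name, are assumptions made for the example.

```python
# First ten elements only, in atomic-number order, to keep the sketch short.
ELEMENTS = ["H", "He", "Li", "Be", "B", "C", "N", "O", "F", "Ne"]


def element_fraction_vector(amounts):
    """Map {element: amount} to a list of atomic fractions in atomic-number order."""
    total = sum(amounts.values())
    return [amounts.get(el, 0) / total for el in ELEMENTS]


# H2O: index 0 (H) holds 2/3 and index 7 (O) holds 1/3; every other entry is 0.
water = element_fraction_vector({"H": 2, "O": 1})
```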
- __init__()¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
- Args:
comp: Pymatgen Composition object
- Returns:
vector (list of floats): fraction of each element in a composition
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- class matminer.featurizers.composition.element.Stoichiometry(p_list=(0, 2, 3, 5, 7, 10), num_atoms=False)¶
Bases:
BaseFeaturizer
Calculate norms of stoichiometric attributes.
- Parameters:
p_list (list of ints): list of norms to calculate
num_atoms (bool): whether to return number of atoms per formula unit
- __init__(p_list=(0, 2, 3, 5, 7, 10), num_atoms=False)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
Get stoichiometric attributes.
- Args:
comp: Pymatgen composition object
p_list (list of ints)
- Returns:
- p_norm (list of floats): Lp norm-based stoichiometric attributes.
Returns number of atoms if no p-values specified.
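The Lp norms of a composition can be sketched as follows. The treatment of p = 0 as the number of distinct elements is an assumption of this example (the entry above does not spell it out), as are the function name and the dictionary input.

```python
def stoichiometry_norms(amounts, p_list=(0, 2, 3, 5, 7, 10)):
    """Lp norms of the atomic-fraction vector of a composition {element: amount}."""
    total = sum(amounts.values())
    fractions = [n / total for n in amounts.values()]
    norms = []
    for p in p_list:
        if p == 0:
            # Assumption: the "0-norm" counts the distinct elements present.
            norms.append(float(len(fractions)))
        else:
            norms.append(sum(f ** p for f in fractions) ** (1.0 / p))
    return norms


# Fe2O3 has fractions 0.4 and 0.6, so the 2-norm is sqrt(0.16 + 0.36).
norms = stoichiometry_norms({"Fe": 2, "O": 3}, p_list=(0, 2))
```

Higher p values weight the majority element more strongly, so the sequence of norms moves toward the largest fraction as p grows.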
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- class matminer.featurizers.composition.element.TMetalFraction¶
Bases:
BaseFeaturizer
Class to calculate fraction of magnetic transition metals in a composition.
- Parameters:
data_source (data class): source from which to retrieve element data
Generates: Fraction of magnetic transition metal atoms in a compound
- __init__()¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
- Args:
comp: Pymatgen Composition object
- Returns:
frac_magn_atoms (single-element list): fraction of magnetic transition metal atoms in a compound
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
matminer.featurizers.composition.ion module¶
Composition featurizers for compositions with ionic data.
- class matminer.featurizers.composition.ion.CationProperty(data_source, features, stats, impute_nan=False)¶
Bases:
ElementProperty
Features based on properties of cations in a material
Requires that oxidation states have already been determined. Property statistics weighted by composition.
Features: Based on the statistics of the data_source chosen, computed by element stoichiometry. The format generally is:
“{data source} {statistic} {property}”
For example:
“DemlData range magn_moment” # Range of magnetic moment via Deml et al. data
For a list of all statistics, see the PropertyStats documentation; for a list of all attributes available for a given data_source, see the documentation for the data sources (e.g., PymatgenData, MagpieData, MatscholarElementData, etc.).
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
Get elemental property attributes
- Args:
comp: Pymatgen composition object
- Returns:
all_attributes: Specified property statistics of features
- classmethod from_preset(preset_name, impute_nan=False)¶
Return ElementProperty from a preset string.
- Args:
- preset_name: (str) can be one of “magpie”, “deml”, “matminer”,
“matscholar_el”, or “megnet_el”.
- impute_nan (bool): if True, the features for the elements
that are missing from the data_source or are NaNs are replaced by the average of each feature over the available elements.
- Returns:
ElementProperty based on the preset name.
- class matminer.featurizers.composition.ion.ElectronAffinity(impute_nan=False)¶
Bases:
BaseFeaturizer
Calculate average electron affinity times formal charge of anion elements. Note: The formal charges must already be computed before calling featurize. Generates average (electron affinity*formal charge) of anions.
- Args:
- impute_nan (bool): if True, the features for the elements
that are missing from the data_source or are NaNs are replaced by the average of each feature over the available elements.
- __init__(impute_nan=False)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
- Args:
comp: (Composition) Composition to be featurized
- Returns:
avg_anion_affin (single-element list): average electron affinity*formal charge of anions
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- class matminer.featurizers.composition.ion.ElectronegativityDiff(stats=None)¶
Bases:
BaseFeaturizer
Features from electronegativity differences between anions and cations.
These features are computed by first determining the concentration-weighted average electronegativity of the anions. For example, the average electronegativity of the anions in CaCoSO is equal to 1/2 of that of S and 1/2 of that of O. We then compute the difference between the electronegativity of each cation and the average anion electronegativity.
The feature values are then determined based on the concentration-weighted statistics in the same manner as ElementProperty features. For example, one value could be the mean electronegativity difference over all the cations.
- Parameters:
stats: Property statistics to compute
Generates average electronegativity difference between cations and anions
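The CaCoSO example from the description can be worked through numerically. The sketch below uses Pauling electronegativities, takes the cation and anion assignments as given, and reports the absolute difference for each cation; the use of an absolute value, the function name, and the dictionary inputs are assumptions rather than documented library behaviour.

```python
def electronegativity_diffs(cations, anions, electronegativity):
    """Per-cation |X_cation - mean anion X| and its concentration-weighted mean.

    cations and anions map element -> amount in the formula unit.
    """
    anion_total = sum(anions.values())
    mean_anion_x = sum(n * electronegativity[el] for el, n in anions.items()) / anion_total
    diffs = {el: abs(mean_anion_x - electronegativity[el]) for el in cations}
    cation_total = sum(cations.values())
    mean_diff = sum(cations[el] * d for el, d in diffs.items()) / cation_total
    return diffs, mean_diff


# Pauling electronegativities for the elements of CaCoSO.
PAULING_X = {"Ca": 1.00, "Co": 1.88, "S": 2.58, "O": 3.44}
diffs, mean_diff = electronegativity_diffs(
    {"Ca": 1, "Co": 1}, {"S": 1, "O": 1}, PAULING_X
)
```

Here the average anion electronegativity is (2.58 + 3.44) / 2 = 3.01, giving differences of 2.01 for Ca and 1.13 for Co.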
- __init__(stats=None)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
- Args:
comp: Pymatgen Composition object
- Returns:
en_diff_stats (list of floats): Property stats of electronegativity difference
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- class matminer.featurizers.composition.ion.IonProperty(data_source=None, impute_nan=False, fast=False)¶
Bases:
BaseFeaturizer
Ionic property attributes. Similar to ElementProperty.
- __init__(data_source=None, impute_nan=False, fast=False)¶
- Args:
- data_source - (OxidationStateMixin) - An AbstractData class that supports
the get_oxidation_state method.
- fast - (boolean) whether to assume elements exist in a single oxidation state,
which can dramatically accelerate the calculation of whether an ionic compound is possible, but will miss heterovalent compounds like Fe3O4.
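Why the fast option misses Fe3O4 can be seen from a small charge-balance search. The sketch below is not the library's algorithm: it simply compares a search where every atom of an element shares one oxidation state with a search where atoms of the same element may differ. The function name, the integer formula-unit inputs and the oxidation-state lists are assumptions for illustration.

```python
from itertools import combinations_with_replacement, product


def neutral_possible(amounts, oxidation_states, fast=False):
    """Check whether any assignment of oxidation states gives zero net charge.

    amounts maps element -> integer count; oxidation_states maps element -> allowed states.
    With fast=True every atom of an element must take the same state.
    """
    per_element_totals = []
    for element, n in amounts.items():
        states = oxidation_states[element]
        if fast:
            totals = {n * s for s in states}
        else:
            totals = {sum(combo) for combo in combinations_with_replacement(states, n)}
        per_element_totals.append(totals)
    return any(sum(choice) == 0 for choice in product(*per_element_totals))


STATES = {"Fe": (2, 3), "O": (-2,)}
fe3o4_fast = neutral_possible({"Fe": 3, "O": 4}, STATES, fast=True)
fe3o4_full = neutral_possible({"Fe": 3, "O": 4}, STATES, fast=False)
fe2o3_fast = neutral_possible({"Fe": 2, "O": 3}, STATES, fast=True)
```

In the single-state search Fe3O4 can only total +6 or +9 on iron against -8 on oxygen, so it is rejected; allowing one Fe2+ and two Fe3+ balances the charge, at the cost of a larger search space.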
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
Ionic character attributes
- Args:
comp: (Composition) Composition to be featurized
- Returns:
cpd_possible (bool): Indicates if a neutral ionic compound is possible
max_ionic_char (float): Maximum ionic character between two atoms
avg_ionic_char (float): Average ionic character
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- class matminer.featurizers.composition.ion.OxidationStates(stats=None)¶
Bases:
BaseFeaturizer
Statistics about the oxidation states of each species. Features are concentration-weighted statistics of the oxidation states.
- __init__(stats=None)¶
- Args:
stats - (list of str): which statistics to compute
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
Main featurizer function, which has to be implemented in any derived featurizer subclass.
- Args:
x: input data to featurize (type depends on featurizer).
- Returns:
(list) one or more features.
- classmethod from_preset(preset_name)¶
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- matminer.featurizers.composition.ion.is_ionic(comp)¶
Determines whether a compound is an ionic compound.
Looks at the oxidation states of each site and checks if both anions and cations exist
- Args:
comp (Composition): Composition to check
- Returns:
(bool) Whether the composition describes an ionic compound
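The check described here reduces to asking whether both signs of oxidation state are present. The sketch below operates on a plain list of oxidation numbers instead of a Composition, which is an assumption made so that it runs without pymatgen; the function name is hypothetical.

```python
def is_ionic_sketch(oxidation_states):
    """True if at least one cation (positive) and one anion (negative) state are present."""
    states = list(oxidation_states)
    return any(s > 0 for s in states) and any(s < 0 for s in states)


nacl_like = is_ionic_sketch([1, -1])   # one cation, one anion
elemental = is_ionic_sketch([0, 0])    # no charged species
```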
matminer.featurizers.composition.orbital module¶
Composition featurizers for orbital data.
- class matminer.featurizers.composition.orbital.AtomicOrbitals¶
Bases:
BaseFeaturizer
Determine HOMO/LUMO features based on a composition.
The highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) are estimated from the atomic orbital energies of the composition. The atomic orbital energies are from NIST: https://www.nist.gov/pml/data/atomic-reference-data-electronic-structure-calculations
Warning: For compositions with inter-species fractions greater than 10,000 (e.g. dilute alloys such as FeC0.00001) the composition will be truncated (to Fe in this example). In such extreme cases, the truncation likely reflects the true physics of the situation (i.e. that the dilute element does not significantly contribute orbital character to the band structure), but the user should be aware of this behavior.
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
- Args:
- comp: (Composition)
pymatgen Composition object
- Returns:
HOMO_character: (str) orbital symbol (‘s’, ‘p’, ‘d’, or ‘f’)
HOMO_element: (str) symbol of element for HOMO
HOMO_energy: (float in eV) absolute energy of HOMO
LUMO_character: (str) orbital symbol (‘s’, ‘p’, ‘d’, or ‘f’)
LUMO_element: (str) symbol of element for LUMO
LUMO_energy: (float in eV) absolute energy of LUMO
gap_AO: (float in eV) the estimated bandgap from HOMO and LUMO energies
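As a rough illustration of how a HOMO, a LUMO and a gap can be read off a list of atomic orbital energies, the sketch below fills orbitals in order of increasing energy. It is a simplified assumption about the procedure, not matminer's implementation: the function name, the tuple layout and the energies are all made up for the example.

```python
def estimate_homo_lumo(orbitals, n_electrons):
    """Fill orbitals from lowest energy upward and report HOMO, LUMO and gap.

    orbitals: list of (label, energy_eV, capacity) tuples (hypothetical layout).
    n_electrons: total number of electrons to place (must be positive).
    """
    if n_electrons <= 0:
        raise ValueError("n_electrons must be positive")
    ordered = sorted(orbitals, key=lambda orb: orb[1])
    remaining = n_electrons
    for index, orb in enumerate(ordered):
        capacity = orb[2]
        if remaining > capacity:
            remaining -= capacity  # orbital completely filled, keep going
            continue
        homo = orb
        if remaining < capacity:
            lumo = orb  # partially filled: HOMO and LUMO coincide
        elif index + 1 < len(ordered):
            lumo = ordered[index + 1]  # exactly filled: next orbital is empty
        else:
            raise ValueError("no empty orbital left for the LUMO")
        return homo, lumo, lumo[1] - homo[1]
    raise ValueError("more electrons than orbital capacity")


# Illustrative energies only, not NIST values.
toy_orbitals = [("A 2p", -9.0, 6), ("B 3s", -5.0, 2), ("B 3p", -2.0, 6)]
homo, lumo, gap = estimate_homo_lumo(toy_orbitals, 8)
```

In this toy case the eight electrons fill "A 2p" and "B 3s", so the gap is the energy difference between "B 3p" and "B 3s".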
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- class matminer.featurizers.composition.orbital.ValenceOrbital(orbitals=('s', 'p', 'd', 'f'), props=('avg', 'frac'), impute_nan=False)¶
Bases:
BaseFeaturizer
Attributes of valence orbital shells
- Args:
data_source (data object): source from which to retrieve element data
orbitals (list): orbitals to calculate
props (list): specifies whether to return average number of electrons in each orbital, fraction of electrons in each orbital, or both
- impute_nan (bool): if True, the features for the elements
that are missing from the data_source or are NaNs are replaced by the average of each feature over the available elements.
- __init__(orbitals=('s', 'p', 'd', 'f'), props=('avg', 'frac'), impute_nan=False)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
Weighted fraction of valence electrons in each orbital
- Args:
comp: Pymatgen composition object
- Returns:
- valence_attributes (list of floats): Average number and/or
fraction of valence electrons in specified orbitals
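The two property types ('avg' and 'frac') can be illustrated with a small stand-alone sketch. The function name and input dictionaries are hypothetical, and normalizing the fractions over the requested orbitals is an assumption of this sketch rather than a statement about matminer's internals.

```python
def valence_orbital_features(fractions, valence, orbitals=("s", "p", "d", "f")):
    """Return (avg, frac) dictionaries keyed by orbital symbol.

    fractions: element -> atomic fraction (expected to sum to 1).
    valence: element -> {orbital: number of valence electrons}.
    """
    # Composition-weighted average number of valence electrons per orbital.
    avg = {
        orb: sum(frac * valence[el].get(orb, 0) for el, frac in fractions.items())
        for orb in orbitals
    }
    # Share of each orbital in the total average valence (assumed normalization).
    total = sum(avg.values())
    frac = {orb: (avg[orb] / total if total else 0.0) for orb in orbitals}
    return avg, frac


# NaCl: Na contributes 3s1, Cl contributes 3s2 3p5.
nacl_avg, nacl_frac = valence_orbital_features(
    {"Na": 0.5, "Cl": 0.5},
    {"Na": {"s": 1}, "Cl": {"s": 2, "p": 5}},
)
```

For NaCl this gives an average of 1.5 s electrons and 2.5 p electrons per atom, so the s and p fractions are 0.375 and 0.625.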
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
matminer.featurizers.composition.packing module¶
Composition featurizers for determining packing characteristics.
- class matminer.featurizers.composition.packing.AtomicPackingEfficiency(threshold=0.01, n_nearest=(1, 3, 5), max_types=6, impute_nan=False)¶
Bases:
BaseFeaturizer
Packing efficiency based on a geometric theory of the amorphous packing of hard spheres.
This featurizer computes two different kinds of features. The first relates to the distance between a composition and the compositions of the clusters of atoms expected to be efficiently packed, based on a theory from Laws et al. (http://www.nature.com/doifinder/10.1038/ncomms9123). The second corresponds to the packing efficiency of a system if all atoms in the alloy are simultaneously as efficiently packed as possible.
The packing efficiency in these models is based on the Atomic Packing Efficiency (APE), which measures the difference between the ratio of the radii of the central atom to its neighbors and the ideal ratio of a cluster with the same number of atoms that has optimal packing efficiency. If the difference between the ratios is too large, the APE is positive. If the difference is too small, the APE is negative.
- Features:
- dist from {k} clusters |APE| < {thr} - The distance between an
alloy composition and the k clusters that have a packing efficiency below thr from ideal
- mean simul. packing efficiency - Mean packing efficiency of all atoms.
The packing efficiency is measured with respect to ideal (0)
- mean abs simul. packing efficiency - Mean absolute value of the
packing efficiencies. Closer to zero is more efficiently packed
- References:
[1] K.J. Laws, D.B. Miracle, M. Ferry, A predictive structural model for bulk metallic glasses, Nat. Commun. 6 (2015) 8123. doi:10.1038/ncomms9123.
- __init__(threshold=0.01, n_nearest=(1, 3, 5), max_types=6, impute_nan=False)¶
Initialize the featurizer
- Args:
- threshold (float): Threshold to use for determining whether a cluster is efficiently packed.
- n_nearest ({int}): Number of nearest clusters to use when considering features
- max_types (int): Maximum number of atom types to consider when looking for efficient clusters. The process for finding efficient clusters is very expensive for large numbers of types
- impute_nan (bool): if True, the features for the elements
that are missing from the data_source or are NaNs are replaced by the average of each feature over the available elements.
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- compute_nearest_cluster_distance(comp)¶
Compute the distance between a composition and the nearest efficiently-packed clusters.
Measures the mean L_2 distance between the alloy composition and the k-nearest clusters with Atomic Packing Efficiencies within the user-specified tolerance of 1. k is any of the numbers defined in the “n_nearest” parameter of this class.
If there are fewer than k efficient clusters in the system, we use the maximum distance between any two compositions (1) for the unmatched neighbors.
- Args:
comp (Composition): Composition of material to evaluate
- Return:
[float] Average distances
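The averaging and padding rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: compositions are plain tuples of atomic fractions over the same element ordering, and the helper name and its `fill` argument are hypothetical.

```python
import math


def mean_nearest_cluster_distances(alloy, clusters, n_nearest=(1, 3, 5), fill=1.0):
    """Mean L2 distance from `alloy` to its k nearest clusters, for each k.

    alloy: tuple of atomic fractions.
    clusters: list of fraction tuples for efficiently-packed clusters.
    fill: distance used for neighbors that do not exist (1, per the text above).
    """
    distances = sorted(math.dist(alloy, cluster) for cluster in clusters)
    results = []
    for k in n_nearest:
        nearest = distances[:k]
        # Pad with the fill distance when fewer than k clusters are available.
        nearest += [fill] * (k - len(nearest))
        results.append(sum(nearest) / k)
    return results


# Two clusters only, so k=3 needs one padded neighbor.
example = mean_nearest_cluster_distances(
    (0.5, 0.5), [(0.5, 0.5), (1.0, 0.0)], n_nearest=(1, 3)
)
```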
- compute_simultaneous_packing_efficiency(comp)¶
Compute the packing efficiency of the system when the neighbor shell of each atom has the same composition as the alloy. When this criterion is satisfied, it is possible for every atom in this system to be simultaneously as efficiently-packed as possible.
- Args:
comp (Composition): Composition to be assessed
- Returns:
(float) Average APE of all atoms
(float) Average deviation of the APE of each atom from ideal (0)
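Given a per-element APE, the two returned averages can be illustrated with the sketch below. It assumes the averages are weighted by atomic fraction and that the APE of each element is already known; both the helper name and the APE values are made up for the example.

```python
def simultaneous_packing_stats(fractions, ape_by_element):
    """Return (mean APE, mean absolute APE), weighted by atomic fraction.

    fractions: element -> atomic fraction (expected to sum to 1).
    ape_by_element: element -> signed APE of that element (ideal is 0).
    """
    mean_ape = sum(frac * ape_by_element[el] for el, frac in fractions.items())
    mean_abs_ape = sum(
        frac * abs(ape_by_element[el]) for el, frac in fractions.items()
    )
    return mean_ape, mean_abs_ape


# Illustrative values: one element slightly over-packed, one under-packed.
mean_ape, mean_abs_ape = simultaneous_packing_stats(
    {"A": 0.5, "B": 0.5}, {"A": 0.02, "B": -0.04}
)
```

The signed mean can cancel to a small number even when individual atoms are far from ideal, which is why the mean absolute value is reported alongside it.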
- create_cluster_lookup_tool(elements)¶
Get the compositions of efficiently-packed clusters in a certain system of elements
- Args:
elements ([Element]): Elements in system
- Return:
- (NearNeighbors): Tool to find nearby clusters in this system. None
if there are no efficiently-packed clusters for this combination of elements
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
Main featurizer function, which has to be implemented in any derived featurizer subclass.
- Args:
x: input data to featurize (type depends on featurizer).
- Returns:
(list) one or more features.
- find_ideal_cluster_size(radius_ratio)¶
Get the optimal cluster size for a certain radius ratio
Finds the number of nearest neighbors n that minimizes |1 - rp(n)/r|, where rp(n) is the ideal radius ratio for a certain n and r is the actual ratio.
- Args:
radius_ratio (float): r / r_{neighbor}
- Returns:
(int) number of neighboring atoms that will be the most efficiently packed.
(float) Optimal APE
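The minimization of |1 - rp(n)/r| can be sketched with a brute-force search. Several things here are assumptions made for illustration: the candidate range of 3 to 24 neighbors, the sign convention of the returned deviation (rp(n)/r - 1), and above all `toy_ideal_ratio`, which is a placeholder and not the ideal-ratio formula of Miracle et al.

```python
def find_best_cluster_size(radius_ratio, ideal_ratio, candidates=range(3, 25)):
    """Pick the neighbor count n that minimizes |1 - ideal_ratio(n) / radius_ratio|.

    ideal_ratio: callable mapping n to the ideal radius ratio rp(n).
    Returns (n, signed deviation rp(n) / radius_ratio - 1).
    """
    best_n = min(
        candidates, key=lambda n: abs(1 - ideal_ratio(n) / radius_ratio)
    )
    return best_n, ideal_ratio(best_n) / radius_ratio - 1


def toy_ideal_ratio(n):
    # Hypothetical placeholder that merely grows with n; NOT the real formula.
    return n / 12


best_n, deviation = find_best_cluster_size(0.8, toy_ideal_ratio)
```

In practice the real ideal-ratio function would be passed in place of the placeholder; the search itself is unchanged.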
- get_ideal_radius_ratio(n_neighbors)¶
Compute the ideal ratio between the radii of the central atom and its neighboring atoms for a cluster with a certain number of nearest neighbors.
Based on work by Miracle, Lord, and Ranganathan.
- Args:
n_neighbors (int): Number of atoms in 1st NN shell
- Return:
(float) ideal radius ratio r / r_{neighbor}
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
matminer.featurizers.composition.thermo module¶
Composition featurizers for thermodynamic properties.
- class matminer.featurizers.composition.thermo.CohesiveEnergy(mapi_key=None)¶
Bases:
BaseFeaturizer
Cohesive energy per atom using elemental cohesive energies and formation energy.
Get cohesive energy per atom of a compound by combining known elemental cohesive energies with the formation energy of the compound.
- Parameters:
- mapi_key (str): Materials API key for looking up formation energy
by composition alone (if you don’t set the formation energy yourself).
- __init__(mapi_key=None)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp, formation_energy_per_atom=None)¶
- Args:
comp: (pymatgen.Composition): A composition
formation_energy_per_atom: (float) the formation energy per atom of your compound. If not set, will look up the most stable formation energy from the Materials Project database.
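One plausible reading of the arithmetic is sketched below. It assumes elemental cohesive energies are positive numbers, so that the compound's cohesive energy per atom is the composition-weighted elemental value minus the formation energy per atom. The helper name and the numbers are hypothetical, and the sketch does not query the Materials Project.

```python
def cohesive_energy_per_atom(fractions, elemental_cohesive, formation_energy_per_atom):
    """Combine elemental cohesive energies with a formation energy (all in eV/atom).

    fractions: element -> atomic fraction (expected to sum to 1).
    elemental_cohesive: element -> elemental cohesive energy (assumed positive).
    formation_energy_per_atom: formation energy of the compound (negative if stable).
    """
    elemental_part = sum(
        frac * elemental_cohesive[el] for el, frac in fractions.items()
    )
    # A negative formation energy makes the compound more strongly bound.
    return elemental_part - formation_energy_per_atom


# Made-up numbers chosen for easy arithmetic, not real materials data.
toy_cohesive = cohesive_energy_per_atom(
    {"A": 0.5, "B": 0.5}, {"A": 1.0, "B": 2.0}, -2.0
)
```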
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).
- class matminer.featurizers.composition.thermo.CohesiveEnergyMP(mapi_key=None)¶
Bases:
BaseFeaturizer
Cohesive energy per atom lookup using Materials Project
- Parameters:
- mapi_key (str): Materials API key for looking up cohesive energy
by composition alone.
- __init__(mapi_key=None)¶
- citations()¶
Citation(s) and reference(s) for this feature.
- Returns:
- (list) each element should be a string citation,
ideally in BibTeX format.
- feature_labels()¶
Generate attribute names.
- Returns:
([str]) attribute labels.
- featurize(comp)¶
- Args:
comp: (str) compound composition, e.g. “NaCl”
- implementors()¶
List of implementors of the feature.
- Returns:
- (list) each element should either be a string with author name (e.g.,
“Anubhav Jain”) or a dictionary with required key “name” and other keys like “email” or “institution” (e.g., {“name”: “Anubhav Jain”, “email”: “ajain@lbl.gov”, “institution”: “LBNL”}).