robocrys.condense package

Submodules

robocrys.condense.component module

This module implements functions for handling structure components.

robocrys.condense.component.components_are_isomorphic(component_a, component_b, use_weights=False)[source]

Determines whether the graphs of two components are isomorphic.

Only takes into account graph connectivity and not local geometry (e.g. bond angles and distances).

Parameters:
  • component_a (dict[str, Any]) – The first component.

  • component_b (dict[str, Any]) – The second component.

  • use_weights (bool) – Whether to use the graph edge weights in comparing graphs.

Returns:

Whether the components are isomorphic.
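To illustrate what a connectivity-only comparison means, the sketch below tests two small undirected graphs for isomorphism by brute force. It works on plain edge lists rather than component dicts, and the name graphs_are_isomorphic is hypothetical; this is not the robocrys implementation and is only practical for very small graphs.

```python
from itertools import permutations


def graphs_are_isomorphic(n_nodes_a, edges_a, n_nodes_b, edges_b):
    """Brute-force isomorphism test for small, simple undirected graphs.

    Only connectivity is compared; no bond lengths or angles are involved.
    """
    if n_nodes_a != n_nodes_b or len(edges_a) != len(edges_b):
        return False
    target = {frozenset(edge) for edge in edges_b}
    for perm in permutations(range(n_nodes_a)):
        # Relabel every edge of graph A and compare with graph B's edge set.
        relabelled = {frozenset((perm[u], perm[v])) for u, v in edges_a}
        if relabelled == target:
            return True
    return False


# A three-node chain written with two different labellings, and a triangle.
chain_a = [(0, 1), (1, 2)]
chain_b = [(0, 2), (2, 1)]
triangle = [(0, 1), (1, 2), (2, 0)]
```

The two chains compare as isomorphic even though their node labels differ, while the chain and the triangle do not.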

robocrys.condense.component.components_are_vdw_heterostructure(components)[source]

Whether a list of components form a van der Waals heterostructure.

A heterostructure is defined here as a structure with more than one formula inequivalent 2D component.

Parameters:

components (list[dict[str, Any]]) – A list of structure components, generated using pymatgen.analysis.dimensionality.get_structure_components.

Return type:

bool

Returns:

Whether the list of components forms a heterostructure.
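The definition above can be sketched in a few lines. This sketch assumes each component dict carries a "dimensionality" key and a pre-computed "formula" key; the latter is a hypothetical convenience for the example, and the function name is not part of robocrys.

```python
def is_vdw_heterostructure(components):
    """Return True if more than one distinct formula occurs among 2D components."""
    formulas_2d = {
        component["formula"]
        for component in components
        if component["dimensionality"] == 2
    }
    return len(formulas_2d) > 1


# Hypothetical stacking of two different layered materials.
layers = [
    {"formula": "MoS2", "dimensionality": 2},
    {"formula": "WSe2", "dimensionality": 2},
    {"formula": "MoS2", "dimensionality": 2},
]
```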

robocrys.condense.component.filter_molecular_components(components)[source]

Separate list of components into molecular and non-molecular components.

Parameters:

components (list[dict[str, Any]]) – A list of structure components, generated using pymatgen.analysis.dimensionality.get_structure_components.

Return type:

tuple[list[dict[str, Any]], list[dict[str, Any]]]

Returns:

The filtered components as a tuple of (molecular_components, other_components).
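A minimal sketch of this kind of split, assuming that molecular components are the zero-dimensional ones and that each component dict has a "dimensionality" key. The function name split_molecular and the "label" key are hypothetical.

```python
def split_molecular(components):
    """Partition components into (molecular, other) by dimensionality."""
    molecular = [c for c in components if c["dimensionality"] == 0]
    other = [c for c in components if c["dimensionality"] != 0]
    return molecular, other


# Hypothetical components: one molecule, one layer, one framework.
components = [
    {"label": "molecule", "dimensionality": 0},
    {"label": "layer", "dimensionality": 2},
    {"label": "framework", "dimensionality": 3},
]
molecular_components, other_components = split_molecular(components)
```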

robocrys.condense.component.get_component_formula(component, use_iupac_formula=True, use_common_formulas=True)[source]

Gets the reduced formula of a single component.

Parameters:
  • component (dict[str, Any]) – A structure component, generated using pymatgen.analysis.dimensionality.get_structure_components.

  • use_iupac_formula (bool, optional) – Whether to order formulas by the IUPAC “electronegativity” series, defined in Table VI of “Nomenclature of Inorganic Chemistry (IUPAC Recommendations 2005)”. This ordering effectively follows the groups and rows of the periodic table, except for the lanthanides, actinides and hydrogen. If set to False, the elements will be ordered according to their electronegativity values.

  • use_common_formulas (bool) – Whether to use the database of common formulas. The common formula will be used in preference to the IUPAC or reduced formula.

Return type:

str

Returns:

The reduced formula of the component.

robocrys.condense.component.get_component_formula_and_factor(component, use_iupac_formula=True, use_common_formulas=True)[source]

Gets the reduced formula and factor of a single component.

Parameters:
  • component (dict[str, Any]) – A structure component, generated using pymatgen.analysis.dimensionality.get_structure_components.

  • use_iupac_formula (bool, optional) – Whether to order formulas by the IUPAC “electronegativity” series, defined in Table VI of “Nomenclature of Inorganic Chemistry (IUPAC Recommendations 2005)”. This ordering effectively follows the groups and rows of the periodic table, except for the lanthanides, actinides and hydrogen. If set to False, the elements will be ordered according to their electronegativity values.

  • use_common_formulas (bool) – Whether to use the database of common formulas. The common formula will be used in preference to the IUPAC or reduced formula.

Return type:

tuple[str, int]

Returns:

The formula and factor of the component.
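The relationship between a reduced formula and its factor can be sketched as follows. The example assumes integer element counts supplied as a dict, keeps the elements in input order, and ignores IUPAC ordering and the common-formula database; reduce_composition is a hypothetical name.

```python
from functools import reduce
from math import gcd


def reduce_composition(element_counts):
    """Return (reduced_formula, factor) for a dict of integer element counts."""
    # The factor is the greatest common divisor of all element counts.
    factor = reduce(gcd, element_counts.values())
    parts = []
    for element, count in element_counts.items():
        reduced = count // factor
        parts.append(element if reduced == 1 else f"{element}{reduced}")
    return "".join(parts), factor


formula, factor = reduce_composition({"Ga": 2, "As": 2})
```

Here a Ga2As2 component reduces to the formula GaAs with a factor of 2.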

robocrys.condense.component.get_formula_from_components(components, molecules_first=False, use_iupac_formula=True, use_common_formulas=True)[source]

Reconstructs a chemical formula from structure components.

The chemical formulas for the individual components will be grouped together. If two components share the same composition, they will be treated as equivalent.

Parameters:
  • components (list[dict[str, Any]]) – A list of structure components, generated using pymatgen.analysis.dimensionality.get_structure_components.

  • molecules_first (bool) – Whether to put any molecules (zero-dimensional components) at the beginning of the formula.

  • use_iupac_formula (bool, optional) – Whether to order formulas by the IUPAC “electronegativity” series, defined in Table VI of “Nomenclature of Inorganic Chemistry (IUPAC Recommendations 2005)”. This ordering effectively follows the groups and rows of the periodic table, except for the lanthanides, actinides and hydrogen. If set to False, the elements will be ordered according to their electronegativity values.

  • use_common_formulas (bool) – Whether to use the database of common formulas. The common formula will be used in preference to the IUPAC or reduced formula.

Return type:

str

Returns:

The chemical formula.

robocrys.condense.component.get_formula_inequiv_components(components, use_iupac_formula=True, use_common_formulas=True)[source]

Gets and counts the inequivalent components based on their formula.

Note that components are counted differently than in get_sym_inequiv_components: the count is not the number of components with the same formula. For example, the count of the formula “GaAs” in a system with two Ga2As2 components would be 4.

Parameters:
  • components (list[dict[str, Any]]) – A list of structure components, generated using pymatgen.analysis.dimensionality.get_structure_components, with inc_site_ids=True.

  • use_iupac_formula (bool, optional) – Whether to order formulas by the IUPAC “electronegativity” series, defined in Table VI of “Nomenclature of Inorganic Chemistry (IUPAC Recommendations 2005)”. This ordering effectively follows the groups and rows of the periodic table, except for the lanthanides, actinides and hydrogen. If set to False, the elements will be ordered according to their electronegativity values.

  • use_common_formulas (bool) – Whether to use the database of common formulas. The common formula will be used in preference to the IUPAC or reduced formula.

Return type:

list[dict[str, Any]]

Returns:

A list of the compositionally inequivalent components. Any duplicate components will only be returned once. The component objects are in the same format as given by pymatgen.analysis.dimensionality.get_structure_components but have two additional fields:

  • "count" (int): The number of formula units of this component. Note, this is not the number of components with the same formula. For example, the count of the formula “GaAs” in a system with two Ga2As2 components would be 4.

  • "formula" (str): The reduced formula of the component.
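The counting convention can be sketched as summing the factors of all components that share a reduced formula. The input here is a hypothetical list of (formula, factor) pairs rather than real component dicts, and count_formula_units is not a robocrys function.

```python
from collections import Counter


def count_formula_units(formulas_and_factors):
    """Sum the factors of all components that share a reduced formula."""
    counts = Counter()
    for formula, factor in formulas_and_factors:
        counts[formula] += factor
    return dict(counts)


# Two Ga2As2 components: each reduces to "GaAs" with a factor of 2.
two_ga2as2 = [("GaAs", 2), ("GaAs", 2)]
formula_counts = count_formula_units(two_ga2as2)
```

The count for GaAs is 4 (formula units), not 2 (components).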

robocrys.condense.component.get_reconstructed_structure(components, simplify_molecules=True)[source]

Reconstructs a structure from a list of components.

Has the option to simplify molecular components into a single site positioned at the centre of mass of the molecule. If using this option, the components must have been generated with inc_molecule_graph=True.

Parameters:
Return type:

Structure

Returns:

The reconstructed structure.

robocrys.condense.component.get_structure_inequiv_components(components, use_structure_graph=False, fingerprint_tol=0.01)[source]

Gets and counts the structurally inequivalent components.

Supports matching through StructureMatcher or by a combined structure graph/site fingerprint approach. For the latter method, the component data has to have been generated with inc_graph=True.

Parameters:
Return type:

list[dict[str, Any]]

Returns:

A list of the structurally inequivalent components. Any duplicate components will only be returned once. The component objects are in the same format as given by pymatgen.analysis.dimensionality.get_structure_components but have an additional field:

  • "count" (int): The number of times this component appears in the structure.

robocrys.condense.component.get_sym_inequiv_components(components, spg_analyzer)[source]

Gets and counts the symmetrically inequivalent components.

Component data has to have been generated with inc_site_ids=True.

Parameters:
Return type:

list[dict[str, Any]]

Returns:

A list of the symmetrically inequivalent components. Any duplicate components will only be returned once. The component objects are in the same format as given by pymatgen.analysis.dimensionality.get_structure_components but have an additional field:

  • "count" (int): The number of times this component appears in the structure.

robocrys.condense.component.get_vdw_heterostructure_information(components, use_iupac_formula=True, use_common_formulas=True, inc_ordered_components=False, inc_intercalants=False)[source]

Gets information about ordering of components in a vdw heterostructure.

Parameters:
  • components (list[dict[str, Any]]) – A list of structure components, generated using pymatgen.analysis.dimensionality.get_structure_components with inc_orientation=True.

  • use_iupac_formula (bool, optional) – Whether to order formulas by the IUPAC “electronegativity” series, defined in Table VI of “Nomenclature of Inorganic Chemistry (IUPAC Recommendations 2005)”. This ordering effectively follows the groups and rows of the periodic table, except for the lanthanides, actinides and hydrogen. If set to False, the elements will be ordered according to their electronegativity values.

  • use_common_formulas (bool) – Whether to use the database of common formulas. The common formula will be used in preference to the IUPAC or reduced formula.

  • inc_ordered_components (bool) – Whether to return a list of the ordered components. If False, just the component formulas will be returned.

  • inc_intercalants (bool) – Whether to return a list of the intercalants. If False, just the intercalant formulas will be returned.

Returns:

Information on the heterostructure, as a dict with the keys:

  • "repeating_unit" (list[str]): A list of formulas of the smallest repeating series of components. For example, if the structure consists of A and B components ordered as “A B A B A B”, the repeating unit is “A B”.

  • "num_repetitions" (int): The number of repetitions of the repeating unit that forms the overall structure. For example, if the structure consists of A and B components ordered as “A B A B A B”, the number of repetitions is 3.

  • "intercalant_formulas" (list[str]): The formulas of the intercalated compounds.

  • "ordered_components" (list[component]): If inc_ordered_components, a list of components, ordered as they appear in the heterostructure stacking direction.

  • "intercalants" (list[component]): If inc_intercalants, a list of intercalated components.

Return type:

dict
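The repeating unit and number of repetitions can be sketched for a list of formulas that is already ordered along the stacking direction. This is an illustrative approach, not necessarily how robocrys determines them; smallest_repeating_unit is a hypothetical name.

```python
def smallest_repeating_unit(formulas):
    """Return (repeating_unit, num_repetitions) for an ordered list of formulas."""
    n = len(formulas)
    for size in range(1, n + 1):
        # A candidate unit must tile the whole sequence exactly.
        if n % size == 0 and formulas[:size] * (n // size) == formulas:
            return formulas[:size], n // size
    return formulas, 1


unit, repetitions = smallest_repeating_unit(["A", "B", "A", "B", "A", "B"])
```

For the “A B A B A B” ordering this gives the unit ["A", "B"] repeated 3 times.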

robocrys.condense.condenser module

This module defines a class for condensing structures into dict representations.

class robocrys.condense.condenser.StructureCondenser(use_conventional_cell=True, near_neighbors=None, mineral_matcher=None, use_symmetry_equivalent_sites=False, symprec=0.01, simplify_molecules=True, use_iupac_formula=True, use_common_formulas=True)[source]

Bases: object

Class to transform a structure into an intermediate dict representation.

Parameters:
  • use_conventional_cell (bool) – Whether to always use the conventional cell representation of the structure.

  • near_neighbors (Optional[NearNeighbors]) – A NearNeighbors instance used to calculate the bonding in the structure. For example, one of pymatgen.analysis.local_env.CrystalNN, pymatgen.analysis.local_env.VoronoiNN, etc. Defaults to None, in which case pymatgen.analysis.local_env.CrystalNN will be used.

  • mineral_matcher (Optional[MineralMatcher]) – A MineralMatcher instance. Defaults to None in which case the default MineralMatcher settings will be used. If set to False, no mineral matching will occur.

  • use_symmetry_equivalent_sites (bool) – Whether to use symmetry to determine if sites are inequivalent. If False, the site geometry and (next) nearest neighbor information will be used.

  • symprec (float) – The tolerance used when determining the symmetry of the structure. The symmetry can be used both to determine if multiple sites are symmetrically equivalent (if use_symmetry_equivalent_sites is True) and to obtain the symmetry labels for each site.

  • use_iupac_formula (bool, optional) – Whether to order formulas by the IUPAC “electronegativity” series, defined in Table VI of “Nomenclature of Inorganic Chemistry (IUPAC Recommendations 2005)”. This ordering effectively follows the groups and rows of the periodic table, except for the lanthanides, actinides and hydrogen. If set to False, the elements will be ordered according to their electronegativity values.

  • use_common_formulas (bool) – Whether to use the database of common formulas. The common formula will be used in preference to the IUPAC or reduced formula.

condense_structure(structure)[source]

Condenses the structure into an intermediate dict representation.

Parameters:

structure (Structure) – A pymatgen structure object.

Return type:

dict[str, Any]

Returns:

The condensed structure information. The data is formatted as a dict with a fixed set of keys. An up-to-date example of the condensed representation of MoS2 is given in the documentation. See: robocrystallographer/docs_rst/source/format.rst or https://hackingmaterials.lbl.gov/robocrystallographer/format.html

robocrys.condense.fingerprint module

robocrys.condense.fingerprint.get_fingerprint_distance(structure_a, structure_b)[source]

Gets the euclidean distance between the fingerprints of two structures.

Parameters:
  • structure_a (IStructure | Iterable) – The first structure or fingerprint. Can be provided as a structure or a fingerprint. If provided as a structure, the fingerprint will be calculated first, so generally it is quicker to pre-calculate the fingerprint if comparing against multiple structures in turn.

  • structure_b (IStructure | Iterable) – The second structure or fingerprint. Can be provided as a structure or a fingerprint. If provided as a structure, the fingerprint will be calculated first, so generally it is quicker to pre-calculate the fingerprint if comparing against multiple structures in turn.

Return type:

float

Returns:

The Euclidean distance between the two fingerprints.
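The distance itself, and the benefit of pre-calculating a fingerprint, can be sketched with plain Python sequences standing in for fingerprints. The vectors below are made-up values and fingerprint_distance is a hypothetical helper, not the robocrys function.

```python
import math


def fingerprint_distance(fingerprint_a, fingerprint_b):
    """Euclidean distance between two equal-length fingerprint vectors."""
    return math.dist(fingerprint_a, fingerprint_b)


# Pre-calculate the reference fingerprint once, then reuse it for every comparison.
reference = [0.0, 0.0]
candidates = {"close": [0.3, 0.4], "far": [3.0, 4.0]}
distances = {
    name: fingerprint_distance(reference, fingerprint)
    for name, fingerprint in candidates.items()
}
```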

robocrys.condense.fingerprint.get_site_fingerprints(structure, as_dict=True, preset='CrystalNNFingerprint_ops')[source]

Gets the fingerprint for all sites in a structure.

Parameters:
  • structure (IStructure) – A structure.

  • as_dict (bool) – Whether to return the fingerprints as a dictionary of {'op': val}. Defaults to True.

  • preset (str) – The preset to use when calculating the fingerprint. See matminer.featurizers.structure.SiteStatsFingerprint for more details.

Return type:

list[dict[str, int]] | ndarray

Returns:

The fingerprint for all sites in the structure. If as_dict == True, the data will be returned as a list of dict containing the order parameters as:

[{'op': val}]

for each site. If as_dict == False, the data will be returned as a numpy.ndarray containing the fingerprint for each site as:

[site_index][op_index]

robocrys.condense.fingerprint.get_structure_fingerprint(structure, preset='CrystalNNFingerprint_ops', stats=('mean', 'std_dev'), prototype_match=False)[source]

Gets the fingerprint for a structure.

Parameters:
  • structure (IStructure) – A structure.

  • preset (str) – The preset to use when calculating the fingerprint. See matminer.featurizers.structure.SiteStatsFingerprint for more details.

  • stats (Optional[tuple[str]]) – The stats to include in the fingerprint. See matminer.featurizers.structure.SiteStatsFingerprint for more details.

  • prototype_match (bool) – Whether to use distance cutoffs and electronegativity differences when calculating the structure fingerprint.

Return type:

ndarray

Returns:

The structure fingerprint as a numpy.ndarray.
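As a rough illustration of how the default stats ('mean', 'std_dev') turn per-site fingerprints into one structure-level vector, the sketch below collapses each order parameter across sites. The use of the population standard deviation and the output ordering (mean then std_dev for each order parameter) are assumptions made for this example, not a statement of what matminer does internally.

```python
from statistics import mean, pstdev


def combine_site_fingerprints(site_fingerprints):
    """Collapse per-site vectors into [mean, std_dev] pairs per order parameter."""
    combined = []
    # zip(*...) yields one tuple per order parameter, holding its value at every site.
    for values in zip(*site_fingerprints):
        combined.extend([mean(values), pstdev(values)])
    return combined


# Two hypothetical sites with two order parameters each.
site_fingerprints = [[0.0, 1.0], [1.0, 1.0]]
structure_fingerprint = combine_site_fingerprints(site_fingerprints)
```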

robocrys.condense.mineral module

This module provides tools for matching structures to known mineral classes.

class robocrys.condense.mineral.MineralMatcher(initial_ltol=0.2, initial_stol=0.3, initial_angle_tol=5.0, use_fingerprint_matching=True, fingerprint_distance_cutoff=0.4, mineral_db=None)[source]

Bases: object

Class to match a structure to a mineral name.

Uses a precomputed database of minerals and their fingerprints, extracted from the AFLOW prototype database. For more information on this database see reference [aflow]:

[aflow]

Mehl, M. J., Hicks, D., Toher, C., Levy, O., Hanson, R. M., Hart, G., & Curtarolo, S. (2017), The AFLOW library of crystallographic prototypes: part 1. Computational Materials Science, 136, S1-S828. doi: 10.1016/j.commatsci.2017.01.017

Parameters:
  • initial_ltol (float) – The fractional length tolerance used in the AFLOW structure matching.

  • initial_stol (float) – The site coordinate tolerance used in the AFLOW structure matching.

  • initial_angle_tol (float) – The angle tolerance used in the AFLOW structure matching.

  • use_fingerprint_matching (bool) – Whether to use the fingerprint distance to match minerals.

  • fingerprint_distance_cutoff (float) – Cutoff to determine how similar a match must be to be returned. The distance is measured between the structural fingerprints in euclidean space.

  • mineral_db (Union[str, Path, DataFrame, None]) – Optional path or pandas.DataFrame object containing the mineral fingerprint database.

get_aflow_matches(structure)[source]

Gets minerals for a structure by matching to AFLOW prototypes.

Overrides pymatgen.analysis.aflow_prototypes.AflowPrototypeMatcher to only return matches to prototypes with known mineral names.

The AFLOW tolerance parameters (defined in the init method) are passed to a pymatgen.analysis.structure_matcher.StructureMatcher object. The tolerances are gradually decreased until only a single match is found (if possible).

The AFLOW structure prototypes are detailed in reference [aflow].

Parameters:

structure (IStructure) – A pymatgen structure to match.

Return type:

Optional[list[dict[str, Any]]]

Returns:

A list of dict, sorted by how close the match is, with the keys ‘type’, ‘distance’, ‘structure’. Distance is the euclidean distance between the structure and prototype fingerprints. If no match was found within the tolerances, None will be returned.

get_best_mineral_name(structure)[source]

Gets the “best” mineral name for a structure.

Uses a combination of AFLOW prototype matching and fingerprinting to get the best mineral name.

The AFLOW structure prototypes are detailed in reference [aflow].

The algorithm works as follows:

  1. Check for AFLOW match. If single match return mineral name.

  2. If multiple matches, return the one with the smallest fingerprint distance.

  3. If no AFLOW match, get fingerprints within tolerance. If there are any matches, take the one with the smallest distance.

  4. If no fingerprints within tolerance, get fingerprints without constraining the number of species types. If any matches, take the best one.
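The four steps above can be sketched as a small decision function operating on match lists that have already been computed. The function name and its arguments are hypothetical, and the "distance" and "n_species_types_match" values returned when nothing matches are placeholders.

```python
def choose_mineral_name(aflow_matches, fingerprint_matches, relaxed_matches):
    """Pick a mineral name from pre-computed match lists.

    Each argument is a list of {"type": str, "distance": float} dicts, or None
    when that search found nothing.
    """

    def closest(matches):
        return min(matches, key=lambda match: match["distance"])

    if aflow_matches:
        # Steps 1 and 2: with a single match this is simply that match.
        best = closest(aflow_matches)
        return {"type": best["type"], "distance": -1, "n_species_types_match": True}
    if fingerprint_matches:
        # Step 3: fingerprint matches found with the species-count constraint.
        best = closest(fingerprint_matches)
        return {
            "type": best["type"],
            "distance": best["distance"],
            "n_species_types_match": True,
        }
    if relaxed_matches:
        # Step 4: fingerprint matches found without the species-count constraint.
        best = closest(relaxed_matches)
        return {
            "type": best["type"],
            "distance": best["distance"],
            "n_species_types_match": False,
        }
    # No match: only "type" being None is documented; the rest are placeholders.
    return {"type": None, "distance": -1, "n_species_types_match": None}
```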

Parameters:

structure (Structure) – A pymatgen Structure object to match.

Returns:

The mineral name information. Stored as a dict with the keys “type”, “distance”, “n_species_types_match”, corresponding to the mineral name, the fingerprint distance between the prototype and known mineral, and whether the number of species types in the structure matches the number in the known prototype, respectively. If no mineral match is determined, the mineral type will be None. If an AFLOW match is found, the distance will be set to -1.

Return type:

(dict)

get_fingerprint_matches(structure, max_n_matches=None, match_n_sp=True, mineral_name_constraint=None)[source]

Gets minerals for a structure by matching to AFLOW fingerprints.

Only AFLOW prototypes with mineral names are considered. The AFLOW structure prototypes are detailed in reference [aflow].

Parameters:
  • structure (IStructure) – A structure to match.

  • max_n_matches (Optional[int]) – Maximum number of matches to return. Set to None to return all matches within the cutoff.

  • match_n_sp (bool) – Whether the structure and mineral must have the same number of species. Defaults to True.

  • mineral_name_constraint (Optional[str]) – Whether to limit the matching to a specific mineral name.

Return type:

Optional[list[dict[str, Any]]]

Returns:

A list of dict, sorted by how close the match is, with the keys ‘type’, ‘distance’, ‘structure’. Distance is the euclidean distance between the structure and prototype fingerprints. If no match was found within the tolerances, None will be returned.

robocrys.condense.molecule module

This module implements a class to match molecule graphs to molecule names.

Some functionality relies on having a working internet connection.

class robocrys.condense.molecule.MoleculeNamer(use_online_pubchem=True, name_preference=('traditional', 'iupac'))[source]

Bases: object

get_name_from_molecule_graph(molecule_graph)[source]

Gets the name of a molecule from a molecule graph object.

Parameters:

molecule_graph (MoleculeGraph) – A molecule graph.

Return type:

Optional[str]

Returns:

The molecule name if a match is found else None.

get_name_from_pubchem(smiles)[source]

Tries to get the name of a molecule from the Pubchem website.

Parameters:

smiles (str) – A SMILES string.

Return type:

Optional[str]

Returns:

The molecule name if a match is found else None.

static molecule_graph_to_smiles(molecule_graph)[source]

Converts a molecule graph to SMILES string.

Parameters:

molecule_graph (MoleculeGraph) – A molecule graph.

Return type:

Optional[str]

Returns:

The SMILES representation of the molecule.

name_sources = ('traditional', 'iupac')

robocrys.condense.site module

This module provides a class to extract geometry and neighbor information.

class robocrys.condense.site.SiteAnalyzer(bonded_structure, use_symmetry_equivalent_sites=False, symprec=0.01, minimum_geometry_op=0.4, use_iupac_formula=True)[source]

Bases: object

Class to extract information on site geometry and bonding.

symmetry_labels

A dict mapping the site indices to the symmetry label for that site. If two sites are symmetrically equivalent they share the same symmetry label. The numbering begins at 1 for each element in the structure.

equivalent_sites

A list of indices mapping each site in the structure to a symmetrically or structurally equivalent site, depending on the value of use_symmetry_equivalent_sites.

Parameters:
  • bonded_structure (StructureGraph) – A bonded structure with nearest neighbor data included. For example generated using pymatgen.analysis.local_env.CrystalNN or pymatgen.analysis.local_env.VoronoiNN.

  • use_symmetry_equivalent_sites (bool) – Whether to use symmetry to determine if sites are inequivalent. If False, the site geometry and (next) nearest neighbor information will be used.

  • symprec (float) – The tolerance used when determining the symmetry of the structure. The symmetry can be used both to determine if multiple sites are symmetrically equivalent and to obtain the symmetry labels for each site.

  • minimum_geometry_op (float) – The minimum geometrical order parameter for a geometry match to be returned.

  • use_iupac_formula (bool, optional) – Whether to order formulas by the IUPAC “electronegativity” series, defined in Table VI of “Nomenclature of Inorganic Chemistry (IUPAC Recommendations 2005)”. This ordering effectively follows the groups and rows of the periodic table, except for the lanthanides, actinides and hydrogen. If set to False, the elements will be ordered according to their electronegativity values.

get_all_bond_distance_summaries()[source]

Gets the bond distance summaries for all sites.

Return type:

dict[int, dict[int, list[float]]]

Returns:

The bond distance summaries for all sites, formatted as:

{
    from_site: {
        to_site: distances
    }
}

Where from_site and to_site are site indices and distances is a list of float of bond distances.
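The nested layout can be illustrated by grouping a flat list of bonds. The (from_site, to_site, distance) tuples and the distances are made up for the example, and summarise_bond_distances is a hypothetical helper rather than part of SiteAnalyzer.

```python
from collections import defaultdict


def summarise_bond_distances(bonds):
    """Group (from_site, to_site, distance) tuples into {from_site: {to_site: [distances]}}."""
    summary = defaultdict(lambda: defaultdict(list))
    for from_site, to_site, distance in bonds:
        summary[from_site][to_site].append(distance)
    # Convert the nested defaultdicts to plain dicts for a predictable result.
    return {site: dict(neighbors) for site, neighbors in summary.items()}


bonds = [(0, 1, 2.41), (0, 1, 2.41), (0, 2, 2.43), (1, 0, 2.41)]
bond_summary = summarise_bond_distances(bonds)
```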

get_all_connectivity_angle_summaries()[source]

Gets the connectivity angle summaries for all sites.

The connectivity angles are the angles between a site and its next nearest neighbors.

Return type:

dict[int, dict[int, dict[str, list[float]]]]

Returns:

The connectivity angle summaries for all sites, formatted as:

{
    from_site: {
        to_site: {
            connectivity: angles
        }
    }
}

Where from_site and to_site are the site indices of two sites, connectivity is the connectivity type (e.g. 'edge' or 'face') and angles is a list of float of connectivity angles.

get_all_nnn_distance_summaries()[source]

Gets the next nearest neighbor distance summaries for all sites.

Return type:

dict[int, dict[int, dict[str, list[float]]]]

Returns:

The next nearest neighbor distance summaries for all sites, formatted as:

{
    from_site: {
        to_site: {
            connectivity: distances
        }
    }
}

Where from_site and to_site are the site indices of two sites, connectivity is the connectivity type (e.g. 'edge' or 'face') and distances is a list of float of distances.

get_all_site_summaries()[source]

Gets the site summaries for all sites.

Returns:

The site summaries for all sites, formatted as:

{
    site_index: site_summary
}

Where site_summary has the same format as produced by SiteAnalyzer.get_site_summary().

get_bond_distance_summary(site_index)[source]

Gets the bond distance summary for a site.

Parameters:

site_index (int) – The site index (zero based).

Return type:

dict[int, list[float]]

Returns:

The bonding data for the site, formatted as:

{to_site: [dist_1, dist_2, dist_3, ...]}

Where to_site is the index of a nearest neighbor site and dist_1 etc are the bond distances as float.

get_connectivity_angle_summary(site_index)[source]

Gets the connectivity angle summary for a site.

The connectivity angles are the angles between a site and its next nearest neighbors.

Parameters:

site_index (int) – The site index (zero based).

Return type:

dict[int, dict[str, list[float]]]

Returns:

The connectivity angle data for the site, formatted as:

{
    to_site: {
        connectivity_a: [angle_1, angle_2, ...],
        connectivity_b: [angle_1, angle_2, ...]
    }
}

Where to_site is the index of a next nearest neighbor site, connectivity_a etc are the bonding connectivity type, e.g. 'edge' or 'corner' (for edge-sharing and corner-sharing connectivity), and angle_1 etc are the bond angles as float.

get_inequivalent_site_indices(site_indices)[source]

Gets the inequivalent site indices from a list of site indices.

Parameters:

site_indices (list[int]) – The site indices.

Return type:

list[int]

Returns:

The inequivalent site indices. For example, if a structure has 4 sites where the first two are equivalent and the last two are inequivalent, then site_indices=[0, 1, 2, 3] gives the output:

[0, 0, 2, 3]
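This example is consistent with a simple lookup through a per-site mapping such as the equivalent_sites attribute described earlier. The sketch below assumes that mapping is a list whose entry i is the representative site for site i; it is an illustration, not the library code.

```python
def to_inequivalent_indices(site_indices, equivalent_sites):
    """Map each site index to the index of its representative (inequivalent) site."""
    return [equivalent_sites[index] for index in site_indices]


# Four sites: sites 0 and 1 are equivalent, sites 2 and 3 are each unique.
equivalent_sites = [0, 0, 2, 3]
inequivalent = to_inequivalent_indices([0, 1, 2, 3], equivalent_sites)
```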

get_nearest_neighbors(site_index, inc_inequivalent_site_index=True)[source]

Gets information about the bonded nearest neighbors.

Parameters:
  • site_index (int) – The site index (zero based).

  • inc_inequivalent_site_index (bool) – Whether to include the inequivalent site indices in the nearest neighbor information.

Return type:

list[dict[str, Any]]

Returns:

For each site bonded to site_index, returns a dict with the format:

{'element': el, 'dist': distance}

If inc_inequivalent_site_index=True, the data will have an additional key 'inequiv_index' corresponding to the inequivalent site index. E.g. if two sites are structurally/symmetrically equivalent (depending on the value of self.use_symmetry_equivalent_sites) then they will have the same inequiv_index.

get_next_nearest_neighbors(site_index, inc_inequivalent_site_index=True)[source]

Gets information about the bonded next nearest neighbors.

Parameters:
  • site_index (int) – The site index (zero based).

  • inc_inequivalent_site_index (bool) – Whether to include the inequivalent site indices.

Return type:

list[dict[str, Any]]

Returns:

A list of the next nearest neighbor information. For each next nearest neighbor site, returns a dict with the format:

{'element': el, 'connectivity': con, 'geometry': geom,
 'angles': angles, 'distance': distance}

The connectivity property is the connectivity type to the next nearest neighbor, e.g. “face”, “corner”, or “edge”. The geometry property gives the geometry of the next nearest neighbor site. See the get_site_geometry method for the format of this data. The angles property gives the bond angles between the site and the next nearest neighbor, returned as a list of int. Multiple bond angles are given when the two sites share more than one nearest neighbor (e.g. if they are face-sharing or edge-sharing). The distance property gives the distance between the site and the next nearest neighbor. If inc_inequivalent_site_index=True, the data will have an additional key 'inequiv_index' corresponding to the inequivalent site index. E.g. if two sites are structurally/symmetrically equivalent (depending on the value of self.use_symmetry_equivalent_sites) then they will have the same inequiv_index.

get_nnn_distance_summary(site_index)[source]

Gets the next nearest neighbor distance summary for a site.

Parameters:

site_index (int) – The site index (zero based).

Return type:

dict[int, dict[str, list[float]]]

Returns:

The connectivity distance data for the site, formatted as:

{
    to_site: {
        connectivity_a: [distance_1, distance_2, ...],
        connectivity_b: [distance_1, distance_2, ...]
    }
}

Where to_site is the index of a next nearest neighbor site, connectivity_a etc are the bonding connectivity type, e.g. 'edge' or 'corner' (for edge-sharing and corner-sharing connectivity), and distance_1 etc are the next nearest neighbor distances as float.

get_site_geometry(site_index)[source]

Gets the bonding geometry of a site.

For example, “octahedral” or “square-planar”.

Parameters:

site_index (int) – The site index (zero based).

Return type:

dict[str, str | float]

Returns:

The site geometry information formatted as:

{'type': geometry_type, 'likeness': order_parameter}

Where geometry_type is a str corresponding to the geometry type (e.g. octahedral) and order_parameter is a float indicating how close the geometry is to the perfect geometry. If the largest geometrical order parameter falls beneath robocrys.site.SiteAnalyzer.minimum_geometry_op, the geometry type will be returned as “X-coordinate”, where X is the coordination number.
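The fallback to “X-coordinate” can be sketched as follows. The input is a hypothetical dict of candidate geometries and their order parameters; pick_geometry is not a robocrys function, and the likeness value reported in the fallback case is an assumption of this example.

```python
def pick_geometry(order_parameters, coordination_number, minimum_geometry_op=0.4):
    """Choose the best-matching geometry, or fall back to 'X-coordinate'.

    order_parameters maps a geometry name to its order parameter (must be non-empty).
    """
    geometry_type, likeness = max(order_parameters.items(), key=lambda item: item[1])
    if likeness < minimum_geometry_op:
        # No geometry is a good enough match, so report only the coordination number.
        geometry_type = f"{coordination_number}-coordinate"
    return {"type": geometry_type, "likeness": likeness}


good_match = pick_geometry({"octahedral": 0.95, "pentagonal pyramidal": 0.30}, 6)
poor_match = pick_geometry({"octahedral": 0.20, "pentagonal pyramidal": 0.10}, 6)
```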

get_site_summary(site_index)[source]

Gets a summary of the site information.

Parameters:

site_index (int) – The site index (zero based).

Return type:

dict[str, Any]

Returns:

A summary of the site information, formatted as:

{
    'element': 'Mo4+',
    'geometry': {
        'likeness': 0.5544,
        'type': 'pentagonal pyramidal'
    },
    'nn': [2, 2, 2, 2, 2, 2],
    'nnn': {'edge': [0, 0, 0, 0, 0, 0]},
    'poly_formula': 'S6',
    'sym_labels': (1,)
}

Where element is the species string (if the species has oxidation states, these will be included in the string). The geometry key is the geometry information as produced by SiteAnalyzer.get_site_geometry(). The nn key lists the site indices of the nearest neighbor bonding sites. Note the inequivalent site index is given for each site. The nnn key gives the next nearest neighbor information, broken up by the connectivity to that neighbor. The poly_formula key gives the formula of the bonded nearest neighbors. poly_formula will be None if the site geometry is not in robocrys.util.connected_geometries. The sym_labels key gives the symmetry labels of the site. If two sites are symmetrically equivalent they share the same symmetry label. The numbering begins at 1 for each element in the structure. If SiteAnalyzer.use_symmetry_equivalent_sites is False, each site may have more than one symmetry label, as structural features have instead been used to determine the site equivalences, i.e. two sites are symmetrically distinct but share the same geometry, nearest neighbor and next nearest neighbor properties.

robocrys.condense.site.geometries_match(geometry_a, geometry_b, likeness_tol=0.001)[source]

Determine whether two site geometries match.

Geometry data should be formatted the same as produced by robocrys.site.SiteAnalyzer.get_site_geometry().

Parameters:
  • geometry_a (dict[str, Any]) – The first set of geometry data.

  • geometry_b (dict[str, Any]) – The second set of geometry data.

  • likeness_tol (float) – The tolerance used to determine if two likeness parameters are the same.

Return type:

bool

Returns:

Whether the two geometries are the same.
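One plausible reading of this comparison is that the geometry types must be identical and the likeness values must agree within likeness_tol. The sketch below follows that reading; the strict inequality and the function name are assumptions, not taken from the robocrys source.

```python
def geometries_match_sketch(geometry_a, geometry_b, likeness_tol=0.001):
    """Compare two {'type': ..., 'likeness': ...} dicts within a likeness tolerance."""
    same_type = geometry_a["type"] == geometry_b["type"]
    similar_likeness = abs(geometry_a["likeness"] - geometry_b["likeness"]) < likeness_tol
    return same_type and similar_likeness


octahedral_a = {"type": "octahedral", "likeness": 0.9500}
octahedral_b = {"type": "octahedral", "likeness": 0.9504}
distorted = {"type": "octahedral", "likeness": 0.9000}
tetrahedral = {"type": "tetrahedral", "likeness": 0.9500}
```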

robocrys.condense.site.nn_summaries_match(nn_sites_a, nn_sites_b, bond_dist_tol=0.01, match_bond_dists=True)[source]

Determine whether two sets of nearest neighbors match.

Nearest neighbor data should be formatted the same as produced by robocrys.site.SiteAnalyzer.get_nearest_neighbors().

Parameters:
  • nn_sites_a (list[dict[str, int | str]]) – The first set of nearest neighbors.

  • nn_sites_b (list[dict[str, int | str]]) – The second set of nearest neighbors.

  • bond_dist_tol (float) – The tolerance used to determine if two bond lengths are the same.

  • match_bond_dists (bool) – Whether to consider bond distances when matching.

Return type:

bool

Returns:

Whether the two sets of nearest neighbors match.

robocrys.condense.site.nnn_summaries_match(nnn_sites_a, nnn_sites_b, likeness_tol=0.001, bond_angle_tol=0.1, match_bond_angles=True)[source]

Determine whether two sets of next nearest neighbors match.

Next nearest neighbor data should be formatted the same as produced by robocrys.site.SiteAnalyzer.get_next_nearest_neighbors().

Parameters:
  • nnn_sites_a (list[dict[str, Any]]) – The first set of next nearest neighbors.

  • nnn_sites_b (list[dict[str, Any]]) – The second set of next nearest neighbors.

  • likeness_tol (float) – The tolerance used to determine if two likeness parameters are the same.

  • bond_angle_tol (float) – The tolerance used to determine if two bond angles are the same.

  • match_bond_angles (bool) – Whether to consider bond angles when matching.

Returns:

Whether the two sets of next nearest neighbors match.

Module contents