Modules

catkit.build module

catkit.build.add_adsorbate(atoms)[source]

Add an adsorbate to a surface.

catkit.build.bulk(name, crystalstructure=None, primitive=False, **kwargs)[source]

Return the standard conventional cell of a bulk structure created using ASE. Accepts all keyword arguments for the ase bulk generator.

Parameters:
  • name (Atoms | str) – Chemical symbol or symbols as in ‘MgO’ or ‘NaCl’.
  • crystalstructure (str) – Must be one of sc, fcc, bcc, hcp, diamond, zincblende, rocksalt, cesiumchloride, fluorite or wurtzite.
  • primitive (bool) – Return the primitive unit cell instead of the conventional standard cell.
Returns:

standardized_bulk – The conventional standard or primitive bulk structure.

Return type:

Gratoms object

catkit.build.molecule(species, topology=None, adsorption=False, vacuum=0)[source]

Return gas-phase molecule structures based on species and topology.

Parameters:
  • species (str) – The chemical symbols to construct a molecule from.
  • topology (int, str, or slice) – The indices for the distinct topology produced by the generator.
  • adsorption (bool) – Construct the molecule as though it were adsorbed to a surface parallel to the z-axis.
  • vacuum (float) – Angstroms of vacuum to pad the molecule with.
Returns:

images – 3D structures of the requested chemical species and topologies.

Return type:

list of objects

catkit.build.surface(elements, size, crystal='fcc', miller=(1, 1, 1), termination=0, fixed=0, vacuum=10, **kwargs)[source]

A helper function to return the surface associated with a given set of input parameters to the general surface generator.

Parameters:
  • elements (str or object) – The atomic symbol to be passed to the ASE bulk builder function, or an atoms object representing the bulk structure to use.
  • size (list (3,)) – Number of times to expand the primitive cell along x, y, and z.
  • crystal (str) – The bulk crystal structure to pass to the ASE bulk builder.
  • miller (list (3,) or (4,)) – The Miller index to cleave the surface structure from. If 4 values are given, the Miller-Bravais convention is assumed.
  • termination (int) – The index associated with a specific slab termination.
  • fixed (int) – Number of layers to constrain.
  • vacuum (float) – Angstroms of vacuum to add to the unit cell.
Returns:

slab – Return a slab generated from the specified bulk structure.

Return type:

Gratoms object
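
The four-index Miller-Bravais notation (h, k, i, l) used for hexagonal lattices carries a redundant third index, i = -(h + k). The sketch below shows one way a four-value plane index could be reduced to three values. The helper name to_three_index is hypothetical, and this is not necessarily how catkit handles the conversion internally.

```python
def to_three_index(miller):
    """Reduce a Miller-Bravais plane index (h, k, i, l) to (h, k, l).

    Hypothetical helper for illustration; not part of the catkit API.
    Three-value input is returned unchanged.
    """
    miller = tuple(int(m) for m in miller)
    if len(miller) == 3:
        return miller
    if len(miller) != 4:
        raise ValueError("expected 3 or 4 indices")
    h, k, i, l = miller
    # The third index is redundant for planes and must equal -(h + k).
    if i != -(h + k):
        raise ValueError("invalid Miller-Bravais index: i must equal -(h + k)")
    return (h, k, l)


print(to_three_index((1, 0, -1, 0)))  # (1, 0, 0)
print(to_three_index((0, 0, 0, 1)))   # (0, 0, 1)
```

Note that this simple rule applies to plane indices only; direction indices convert differently.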

catkit.enumeration module

catkit.enumeration.surfaces(bulk, width, miller_indices=(1, 1, 1), terminations=None, sizes=None, vacuum=10, fixed=0, layer_type='angs', **kwargs)[source]

Return a list of enumerated surfaces based on symmetry properties of interest to the user. Any bulk structure provided will be standardized.

This function will take additional keyword arguments for the catkit.gen.surface.SlabGenerator() Class.

Parameters:
  • bulk (str | Atoms) – The atomic symbol to be passed to the ASE bulk builder function, or an atoms object representing the bulk structure to use.
  • width (float) – Minimum width of the slab in angstroms before trimming. Imposing symmetry requirements will reduce the width.
  • miller_indices (int | list (3,) | list of list (n, 3)) – List of the miller indices to enumerate slabs for. If an integer is provided, the value is treated as the maximum miller index to consider for an enumeration of all possible unique miller indices.
  • terminations (int | array_like) – Return the terminations associated with the provided indices. If -1, all possible terminations are enumerated.
  • sizes (None | int | array_like (n,)) – Enumerate all surface sizes in the provided list. Sizes are integers which represent multiples of the smallest possible surface area. If None, return slabs with the smallest possible surface area. If an integer, enumerate all sizes up to that multiple.
  • vacuum (float) – Angstroms of vacuum to add to the unit cell.
  • fixed (int) – Number of layers to constrain.
  • layer_type ('angs' | 'trim' | 'stoich' | 'sym') – Method of slab layering to perform. See also: catkit.gen.surface.SlabGenerator()
Returns:

slabs – Return a list of enumerated slab structures.

Return type:

list of Gratoms objects
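
When miller_indices is given as an integer, every index triple up to that maximum is a candidate. The plain-Python sketch below lists such candidates, keeping only triples whose greatest common divisor is 1. It applies no crystal symmetry, so it overcounts compared with a true enumeration of unique indices, and the function name is hypothetical rather than part of catkit.

```python
from functools import reduce
from itertools import product
from math import gcd


def candidate_miller_indices(max_index):
    """List all (h, k, l) with |h|, |k|, |l| <= max_index and gcd 1.

    Hypothetical helper; symmetry-equivalent planes are NOT merged here.
    """
    values = range(-max_index, max_index + 1)
    indices = []
    for hkl in product(values, repeat=3):
        # gcd(0, 0, 0) == 0 excludes the null vector; a gcd above 1 means
        # the triple is a multiple of a lower index, e.g. (2, 2, 2).
        if reduce(gcd, (abs(v) for v in hkl)) == 1:
            indices.append(hkl)
    return indices


print(len(candidate_miller_indices(1)))  # 26
```

A symmetry-aware enumeration would then discard triples that the bulk's point group maps onto one another.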

catkit.db module

class catkit.db.Atom(structure, number, coordinates, constraints, magmom, charge)[source]

Bases: sqlalchemy.ext.declarative.api.Base

atom_result
charge
element
element_id
id
magmom
structure
structure_id
x_coordinate
x_fixed
y_coordinate
y_fixed
z_coordinate
z_fixed
class catkit.db.Atom_Result(calculator, atom, forces=None)[source]

Bases: sqlalchemy.ext.declarative.api.Base

atom
atoms_id
calculator
calculator_id
id
x_force
y_force
z_force
class catkit.db.Calculator(name, xc=None, kpoints=None, energy_cutoff=None, parameters=None)[source]

Bases: sqlalchemy.ext.declarative.api.Base

atom_result
energy_cutoff
entry
id
name
parameters
structure_result
x_kpoints
xc
y_kpoints
z_kpoints
class catkit.db.Connect(engine='sqlite:///example.db')[source]

A class for accessing a temporary SQLite database. This class works as a context manager and should be used as follows:

with Connect() as db:
    (Perform operation here)
add_entry(images, search_keys=None)[source]
class catkit.db.Element(i)[source]

Bases: sqlalchemy.ext.declarative.api.Base

atom
covalent_radii
id
mass
symbol
class catkit.db.Entry(structure, calculator, trajectory, natoms, energy, fmax, smax=None, search_keys=None)[source]

Bases: sqlalchemy.ext.declarative.api.Base

calculator
calculator_id
energy
fmax
id
natoms
search_keys
smax
structure
structure_id
trajectory
class catkit.db.FingerprintDB(db_name='fingerprints.db', verbose=False)[source]

A class for accessing a temporary SQLite database. This class works as a context manager and should be used as follows:

with FingerprintDB() as fpdb:
    (Perform operation here)

This syntax will automatically construct the temporary database, or access an existing one. Upon exiting the with block, the changes to the database will be automatically committed.
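
The commit-on-exit behavior described above can be sketched with the standard-library sqlite3 module. The class name TemporaryDB, its attributes, and the notes table are assumptions made for illustration; this is not the FingerprintDB implementation.

```python
import os
import sqlite3
import tempfile


class TemporaryDB:
    """Hypothetical stand-in showing the commit-on-exit pattern."""

    def __init__(self, db_name):
        self.db_name = db_name

    def __enter__(self):
        # Opens the database file, creating it if it does not exist.
        self.con = sqlite3.connect(self.db_name)
        self.c = self.con.cursor()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Design choice of this sketch: commit only if no exception occurred.
        if exc_type is None:
            self.con.commit()
        self.con.close()


path = os.path.join(tempfile.mkdtemp(), "example.db")

with TemporaryDB(path) as db:
    db.c.execute("CREATE TABLE IF NOT EXISTS notes (text TEXT)")
    db.c.execute("INSERT INTO notes VALUES ('hello')")

# Reopening the same file shows that the insert was committed on exit.
with TemporaryDB(path) as db:
    rows = db.c.execute("SELECT text FROM notes").fetchall()
print(rows)  # [('hello',)]
```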

create_table()[source]

Creates the database table framework used in SQLite. This includes 3 tables: images, parameters, and fingerprints.

The images table currently stores ase_id information and a unique string. This can be adapted in the future to support atoms objects.

The parameters table stores a symbol (10 character maximum) for convenient reference and a description of the parameter.

The fingerprints table holds a unique image and parameter ID along with a float value for each. The ID pair must be unique.
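
A three-table layout of this kind could look roughly like the sqlite3 sketch below. All column names are assumptions chosen for illustration; the actual schema created by create_table() may differ.

```python
import sqlite3

con = sqlite3.connect(":memory:")
c = con.cursor()

# Column names are illustrative assumptions, not catkit's actual schema.
c.execute("""CREATE TABLE images (
    iid INTEGER PRIMARY KEY AUTOINCREMENT,
    ase_id INTEGER NOT NULL,
    identity TEXT,
    UNIQUE (ase_id, identity))""")
# SQLite accepts CHAR(10) but does not enforce the length limit itself.
c.execute("""CREATE TABLE parameters (
    pid INTEGER PRIMARY KEY AUTOINCREMENT,
    symbol CHAR(10) UNIQUE NOT NULL,
    description TEXT)""")
# The composite primary key makes each (image, parameter) pair unique.
c.execute("""CREATE TABLE fingerprints (
    image_id INTEGER NOT NULL,
    param_id INTEGER NOT NULL,
    value REAL,
    FOREIGN KEY (image_id) REFERENCES images (iid),
    FOREIGN KEY (param_id) REFERENCES parameters (pid),
    PRIMARY KEY (image_id, param_id))""")

c.execute("INSERT INTO images (ase_id, identity) VALUES (1, 'demo')")
c.execute("INSERT INTO parameters (symbol, description) "
          "VALUES ('Z', 'atomic number')")
c.execute("INSERT INTO fingerprints VALUES (1, 1, 29.0)")

# A second value for the same (image, parameter) pair is rejected.
try:
    c.execute("INSERT INTO fingerprints VALUES (1, 1, 30.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```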

fingerprint_entry(ase_id, param_id, value)[source]

Enters a fingerprint value to the database for a given ase and parameter id.

Parameters:
  • ase_id (int) – The unique id associated with an atoms object in the database.
  • param_id (int or str) – The parameter ID or symbol associated with an entry in the parameters table.
  • value (float) – The value of the parameter for the atoms object.
get_fingerprints(ase_ids=None, params=[])[source]

Get the array of values associated with the provided parameters for each ase_id.

Parameters:
  • ase_ids (list) – The ase-ids associated with atoms objects in the database.
  • params (list) – List of symbols or int in parameters table to be selected.
Returns:

fingerprint – An array of values associated with the given parameters (a fingerprint) for each ase_id.

Return type:

array (n,)

get_parameters(selection=None, display=False)[source]

Get an array of integer values which correspond to the parameter IDs for a set of provided symbols.

Parameters:
  • selection (list) – Symbols in parameters table to be selected. If no selection is made, return all parameters.
  • display (bool) – Print parameter descriptions.
Returns:

parameter_ids – Integer values of selected parameters.

Return type:

array (n,)

image_entry(d, identity=None)[source]

Enters a single ase-db image into the fingerprint database. The ase-db ID with identity must be unique. If not, it will be skipped.

This table can be expanded to contain atoms objects in the future.

Parameters:
  • d (ase-db object) – Database entry to parse.
  • identity (str) – An identifier of the user's choice.
Returns:

ase_id – The ase id collected.

Return type:

int

parameter_entry(symbol=None, description=None)[source]

Enters a unique parameter into the database.

Parameters:
  • symbol (str) – A unique symbol the entry can be referenced by. If None, the symbol will be the ID of the parameter as a string.
  • description (str) – A description of the parameter.
class catkit.db.Structure(cell, pbc)[source]

Bases: sqlalchemy.ext.declarative.api.Base

atoms
entry
id
pbc
structure_result
x1_cell
x2_cell
x3_cell
y1_cell
y2_cell
y3_cell
z1_cell
z2_cell
z3_cell
class catkit.db.Structure_Result(calculator, structure, energy, stress=None)[source]

Bases: sqlalchemy.ext.declarative.api.Base

calcualtor_id
calculator
energy
id
structure
structure_id
xx_stress
xy_stress
xz_stress
yy_stress
yz_stress
zz_stress
class catkit.db.Trajectories(**kwargs)[source]

Bases: sqlalchemy.ext.declarative.api.Base

entry_id
id
structure_id

catkit.gratoms module

class catkit.gratoms.Gratoms(symbols=None, positions=None, numbers=None, tags=None, momenta=None, masses=None, magmoms=None, charges=None, scaled_positions=None, cell=None, pbc=None, celldisp=None, constraint=None, calculator=None, info=None, edges=None)[source]

Bases: ase.atoms.Atoms

Graph based atoms object.

An integrated class for an ASE atoms object with a corresponding NetworkX graph.

adj
connectivity
copy()[source]

Return a copy.

degree
edges
get_chemical_tags(rank=2)[source]

Generate a hash descriptive of the chemical formula (rank 0) or include bonding (rank 1).

get_neighbor_symbols(u)[source]

Get chemical symbols for neighboring atoms of u.

get_surface_atoms()[source]

Return surface atoms.

get_unsaturated_nodes(screen=None)[source]
graph
is_isomorph(other)[source]

Check if isomorphic by bond count and atomic number.

nodes
set_surface_atoms(top, bottom=None)[source]

Assign surface atoms.

catkit.learn module

catkit.learn.online_learning(X, y, samples, factors=[1.0, 1.0], nsteps=40, plot=False)[source]

A simple utility for performing online learning. The main components required are a regression method and a scoring technique.

Currently, the scoring methodology and regressor are baked in. These need to be made modular.

A minimum of 3 samples is required for 3-fold cross-validation.
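
The three-sample minimum follows from how k-fold cross-validation partitions the data: each of the 3 folds needs at least one held-out sample. The plain-Python sketch below shows only the index split. It is independent of the regressor and scoring used by online_learning, and the function name is hypothetical.

```python
def kfold_indices(n_samples, n_folds=3):
    """Yield (train, test) index lists for k-fold cross-validation."""
    if n_samples < n_folds:
        raise ValueError("need at least as many samples as folds")
    indices = list(range(n_samples))
    # Distribute the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop


for train, test in kfold_indices(3):
    print(train, test)
# [1, 2] [0]
# [0, 2] [1]
# [0, 1] [2]
```

With only two samples, one of the three folds would have nothing to hold out, which is why the split raises an error in that case.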

catkit.learn.optimizer(obj_func, initial_theta, bounds, gradient=True, minimizer='L-BFGS-B', hopping=0, **kwargs)[source]

Substitute optimizer in scikit-learn Gaussian Process function.

Note that ‘L-BFGS-B’ is equivalent to the standard optimizer used in scikit-learn. This function allows for more direct control over the arguments. See https://docs.scipy.org/doc/scipy/reference/optimize.html

Parameters:
  • obj_func (function) – scikit-learn objective function.
  • initial_theta (array (n,)) – Hyperparameters to be optimized against.
  • bounds (list of tuples (n, 2)) – Lower and upper bounds for each hyperparameter.
  • gradient (bool) – Include the gradient for the optimization function.
  • minimizer (str) – A scipy minimization method.
  • hopping (int) – Perform a number of basin hopping steps.
Returns:

  • theta_opt (list (n,)) – Optimized hyperparameters.
  • func_min (float) – Value of the minimized objective function.
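
The parameters above map naturally onto scipy.optimize.minimize and scipy.optimize.basinhopping. The sketch below is a hypothetical re-implementation of that idea, not catkit's code: sketch_optimizer and toy_objective are names invented here, and the toy objective only mimics the (value, gradient) return convention of a scikit-learn objective function.

```python
import numpy as np
from scipy.optimize import basinhopping, minimize


def sketch_optimizer(obj_func, initial_theta, bounds, gradient=True,
                     minimizer="L-BFGS-B", hopping=0):
    """Hypothetical sketch; returns (theta_opt, func_min)."""
    margs = {"method": minimizer, "bounds": bounds}
    if gradient:
        # jac=True tells scipy that obj_func returns (value, gradient).
        fun = obj_func
        margs["jac"] = True
    else:
        # Without gradients, scipy estimates them by finite differences.
        def fun(theta):
            return obj_func(theta, eval_gradient=False)
    if hopping:
        # Basin hopping reruns the local minimizer from perturbed starts.
        result = basinhopping(fun, initial_theta, niter=hopping,
                              minimizer_kwargs=margs)
    else:
        result = minimize(fun, initial_theta, **margs)
    return result.x, result.fun


def toy_objective(theta, eval_gradient=True):
    """Stand-in objective with its minimum at theta = (1, 1)."""
    theta = np.asarray(theta, dtype=float)
    value = float(np.sum((theta - 1.0) ** 2))
    if eval_gradient:
        return value, 2.0 * (theta - 1.0)
    return value


bounds = [(-5.0, 5.0), (-5.0, 5.0)]
theta_opt, func_min = sketch_optimizer(
    toy_objective, np.array([0.0, 3.0]), bounds)
theta_hop, func_hop = sketch_optimizer(
    toy_objective, np.array([0.0, 3.0]), bounds, hopping=3)
print(theta_opt, func_min)
```

A function with this three-argument calling convention (obj_func, initial_theta, bounds) matches what scikit-learn's Gaussian process estimators expect from a callable optimizer, so extra options would typically be bound beforehand, for example with functools.partial.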