.. title:: automatminer documentation

.. image:: _static/logo.png
   :alt: server
   :align: center
   :width: 600px

`Automatminer `_ is a tool for *automatically* creating **complete** machine learning pipelines for materials science, including automatic featurization with `matminer `_, feature reduction, and an AutoML backend. Put in a materials dataset, get out a machine that predicts materials properties.

How it works
------------

Automatminer automatically decorates a dataset using hundreds of descriptor techniques from matminer's descriptor library, picks the most useful features for learning, and runs a separate AutoML pipeline. Once a pipeline has been fit, it can be summarized in a text file, saved to disk, or used to make predictions on new materials.

.. image:: _static/pipe.png
   :alt: server
   :align: center

Automatminer uses `pandas `_ dataframes for all of its working objects. Put dataframes in, get dataframes out.

.. image:: _static/dataframe_pipe.png
   :alt: server
   :align: center
   :width: 800px

Here's an example of training on known data and extending the model to out-of-sample data:

.. code-block:: python

    from automatminer.pipeline import MatPipe

    # train_df and unknown_df are pandas dataframes that you supply.

    # Fit a pipeline to training data to predict band gap
    pipe = MatPipe()
    pipe.fit(train_df, "band gap")

    # Predict the band gap of some unknown materials
    predicted_df = pipe.predict(unknown_df)

Overview
--------

**Automatminer can work with many kinds of data:**

- both computational and experimental data
- small (~100 samples) to moderate-sized (~100k samples) datasets
- crystalline datasets
- composition-only (i.e., unknown phases) datasets
- datasets containing electronic band structures or density of states

**Many kinds of target properties:**

- electronic
- mechanical
- thermodynamic
- any other kind of property

**And many featurization (descriptor) techniques:**

See `matminer's Table of Featurizers `_ for a full (and growing) list.

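
As a rough illustration of the composition-only case above, a training dataframe can be as small as a column of formulas plus a target column. The sketch below assumes that the input column is named ``composition`` and that formulas may be given as plain strings. The formulas and band gap values are placeholders chosen for illustration, not reference data.

.. code-block:: python

    import pandas as pd

    # Toy training set: one formula column plus the target column.
    # The column name "composition" is an assumption; the band gap values
    # (in eV) are illustrative placeholders, not reference data.
    train_df = pd.DataFrame(
        {
            "composition": ["Fe2O3", "SiO2", "GaAs"],
            "band gap": [2.2, 8.9, 1.4],
        }
    )

    # Materials to predict on: same input column, no target column.
    unknown_df = pd.DataFrame({"composition": ["TiO2", "ZnO"]})

Dataframes shaped like these would play the roles of ``train_df`` and ``unknown_df`` in the example under *How it works*. A real dataset needs far more than three rows; the list above puts the small end at roughly 100 samples.
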

**Automatminer is designed to be easy to use and reproducible**

- Save pipelines which are portable across machines
- Fit a complete pipeline with one line of code
- Predict on new samples with one line of code
- Presets for easy setup

**Automatminer is automatic and accurate**

- No hand tuning required
- Comparable in accuracy to hand-tuned models in benchmark tests

User manual
-----------

.. toctree::
   :maxdepth: 2

   installation.rst
   basic.rst
   advanced.rst
   datasets.rst
   tutorials.rst
   license.rst

.. toctree::
   :hidden:
   :maxdepth: 2

   Python API

What's new?
-----------

Track changes to automatminer through the `changelog `_.

Contributing / Contact / Support
--------------------------------

Want to see something added or changed? Some ways to get involved are:

- Help us improve the documentation: tell us where you got stuck and help improve the install process for everyone.
- Let us know if you'd like to see certain features.
- Point us to areas of the code that are difficult to understand or use.
- Contribute code! You can do this by forking `Automatminer on GitHub `_ and submitting a pull request.
- Post to our `support forum `_. Don't be shy, we look forward to feedback!

See our `contribution guidelines `_ for more information.

For a list of contributors, see our `GitHub page `_.

Citing Automatminer or MatBench
-------------------------------

If you find Automatminer or the MatBench benchmarks helpful in your research, please consider citing our `publication in npj Computational Materials `_:

.. code-block:: text

    Dunn, A., Wang, Q., Ganose, A., Dopp, D., Jain, A. Benchmarking Materials Property
    Prediction Methods: The Matbench Test Set and Automatminer Reference Algorithm.
    npj Computational Materials 6, 138 (2020). https://doi.org/10.1038/s41524-020-00406-3

API documentation
-----------------

Autogenerated API documentation. Beware! Only for the brave.

- :ref:`modindex`
- :ref:`genindex`
- :ref:`search`