.. title:: automatminer documentation
.. image:: _static/logo.png
:alt: server
:align: center
:width: 600px
`Automatminer `_ is a tool for
*automatically* creating **complete** machine
learning pipelines for materials science, including automatic featurization
with `matminer `_, feature
reduction, and an AutoML backend. Put in a materials dataset, get out a machine
that predicts materials properties.
How it works
------------
Automatminer automatically decorates a dataset using hundreds of descriptor
techniques from matminer's descriptor library, picks the most useful
features for learning, and runs a separate AutoML pipeline.
Once a pipeline has been fit, it can be summarized in a text file, saved to
disk, or used to make predictions on new materials.
.. image:: _static/pipe.png
:alt: server
:align: center
Automatminer uses `pandas `_ dataframes for all of
its working objects. Put dataframes in, get dataframes out.
.. image:: _static/dataframe_pipe.png
:alt: server
:align: center
:width: 800px
Here's an example of training on known data and then applying the model to
out-of-sample data.
.. code-block:: python

    from automatminer.pipeline import MatPipe

    # Fit a pipeline to training data to predict band gap
    pipe = MatPipe()
    pipe.fit(train_df, "band gap")

    # Predict the band gap of some unknown materials
    predicted_df = pipe.predict(unknown_df)

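
The ``train_df`` and ``unknown_df`` objects in the example above are ordinary
pandas dataframes. As a rough sketch of how they might be prepared, the snippet
below builds a toy dataframe and holds out two rows as "unknown" materials. The
``composition`` column name, the use of formula strings, and all numeric values
are assumptions made for illustration (the numbers are placeholders, not
reference data); see the user manual for the inputs automatminer actually
expects.

.. code-block:: python

    import pandas as pd

    # Toy dataset. Values are placeholders for illustration, not reference data.
    # The "composition" column name and string formulas are assumptions.
    df = pd.DataFrame(
        {
            "composition": ["Fe2O3", "NaCl", "Si", "GaAs", "MgO", "TiO2"],
            "band gap": [2.0, 5.0, 1.0, 1.5, 7.0, 3.0],
        }
    )

    # Hold out two rows, chosen reproducibly, as the "unknown" materials.
    unknown_df = df.sample(n=2, random_state=0)
    train_df = df.drop(unknown_df.index)

    # Out-of-sample data would not carry the target column, so drop it.
    unknown_df = unknown_df.drop(columns=["band gap"])

Here ``train_df`` keeps the target column for fitting, while ``unknown_df``
carries only the inputs, so both can be passed to ``fit`` and ``predict`` as in
the example above.
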
Overview
--------
**Automatminer can work with many kinds of data:**
- both computational and experimental data
- small (~100 samples) to moderately sized (~100k samples) datasets
- crystalline datasets
- composition-only (i.e., unknown phases) datasets
- datasets containing electronic band structures or densities of states
**Many kinds of target properties:**
- electronic
- mechanical
- thermodynamic
- any other kind of property
**And many featurization (descriptor) techniques:**
See `matminer's Table of Featurizers `_
for a full (and growing) list.
**Automatminer is designed to be easy to use and reproducible**
- Save pipelines which are portable across machines
- Fit a complete pipeline with 1 line of code
- Predict on new samples with 1 line of code
- Presets for easy setup
**Automatminer is automatic and accurate**
- No hand tuning required
- Comparable in accuracy to hand-tuned models in benchmark tests
User manual
--------------
.. toctree::
:maxdepth: 2
installation.rst
basic.rst
advanced.rst
datasets.rst
tutorials.rst
license.rst
.. toctree::
:hidden:
:maxdepth: 2
Python API
What's new?
-----------
Track changes to automatminer through the `changelog
`_.
Contributing / Contact / Support
--------------------------------
Want to see something added or changed? Some ways to get involved are:
- Help us improve the documentation – tell us where you got stuck and improve
the install process for everyone.
- Let us know if you'd like to see certain features.
- Point us to areas of the code that are difficult to understand or use.
- Contribute code! You can do this by forking
`Automatminer on GitHub `_
and submitting a pull request.
- Post to our `support forum `_. Don't be shy; we look forward to your feedback!
See our `contribution guidelines
`_
for more information. For a list of contributors, see our
`GitHub page `_.
Citing Automatminer or MatBench
--------------------------------
If you find Automatminer or the MatBench benchmarks helpful in your research,
please consider citing our `publication in npj Computational Materials `_:
.. code-block:: text
Dunn, A., Wang, Q., Ganose, A., Dopp, D., Jain, A. Benchmarking Materials Property Prediction
Methods: The Matbench Test Set and Automatminer Reference Algorithm. npj Computational Materials
6, 138 (2020). https://doi.org/10.1038/s41524-020-00406-3
API documentation
------------------
Autogenerated API documentation. Beware! Only for the brave.
- :ref:`modindex`
- :ref:`genindex`
- :ref:`search`