TY - JOUR
T1 - Uncertainty-driven dynamics for active learning of interatomic potentials
AU - Kulichenko, Maksim
AU - Barros, Kipton
AU - Lubbers, Nicholas
AU - Li, Ying Wai
AU - Messerly, Richard
AU - Tretiak, Sergei
AU - Smith, Justin S.
AU - Nebgen, Benjamin
PY - 2023/3
Y1 - 2023/3
N2 - Machine learning (ML) models, if trained to data sets of high-fidelity quantum simulations, produce accurate and efficient interatomic potentials. Active learning (AL) is a powerful tool to iteratively generate diverse data sets. In this approach, the ML model provides an uncertainty estimate along with its prediction for each new atomic configuration. If the uncertainty estimate passes a certain threshold, then the configuration is included in the data set. Here we develop a strategy to more rapidly discover configurations that meaningfully augment the training data set. The approach, uncertainty-driven dynamics for active learning (UDD-AL), modifies the potential energy surface used in molecular dynamics simulations to favor regions of configuration space for which there is large model uncertainty. The performance of UDD-AL is demonstrated for two AL tasks: sampling the conformational space of glycine and sampling the promotion of proton transfer in acetylacetone. The method is shown to efficiently explore the chemically relevant configuration space, which may be inaccessible using regular dynamical sampling at target temperature conditions.
UR - http://www.scopus.com/inward/record.url?scp=85149315977&partnerID=8YFLogxK
U2 - 10.1038/s43588-023-00406-5
DO - 10.1038/s43588-023-00406-5
M3 - Article
VL - 3
SP - 230
EP - 239
JO - Nature Computational Science
JF - Nature Computational Science
IS - 3
ER -