S6-15 Machine Learning Accelerated Materials Discovery Using Scarce Data

Ruihao Yuan1*, Dezhen Xue2, Xiangdong Ding2, Jinshan Li1, Jun Sun2, Turab Lookman3

1 State Key Laboratory of Solidification Processing, Northwestern Polytechnical University, Xi’an 710072, China

2 State Key Laboratory for Mechanical Behavior of Materials, Xi’an Jiaotong University, Xi’an 710049, China

3 AiMaterials Research LLC, Santa Fe, NM 87501, USA

EXTENDED ABSTRACT: Machine learning (ML) has been widely used in materials science to reduce the time and cost of discovering new materials. However, the available materials data, especially experimental data, are very limited. ML models trained on such scarce data often perform poorly, which makes it hard to select the "best" candidate for experimental verification from a huge unexplored space. Herein, using BaTiO3-based ceramics as prototypical materials, we discuss how the problems of limited materials data can be addressed through descriptor design, decision making, and domain knowledge: (i) We design a descriptor capable of describing ferroelectric phase transitions. Its performance is validated on four data sets, and the results show that it outperforms existing descriptors such as tolerance factors in both regression and classification tasks. (ii) We combine uncertainty quantification and optimization algorithms within an active learning feedback loop, and systematically investigate the efficiency of different optimization algorithms in searching for ceramics with enhanced electrostrain. We find that the strategy that balances "exploration" and "exploitation" performs best. With this strategy, we synthesize, in only 3 iterations, a compound with an electrostrain 50% higher than the best value in the initial data. (iii) Using the prior physical knowledge that compositions in the crossover region bridging ferroelectric and relaxor behavior can improve energy density, we build a classification model and extract the compositions in the crossover region from a huge unexplored space. The active-learning-based search in the crossover region is much faster than in the full space. (iv) The limited data available for electrocaloric ceramics motivate us to use Landau theory to relate the adiabatic temperature change to the spontaneous polarization. Candidates are then recommended using the polarization-based ML model and an optimization algorithm. A composition synthesized after only 1 iteration shows a high adiabatic temperature change and a wide operating temperature window.
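
As an illustration of the selection step in point (ii), the following is a minimal sketch of one way to balance "exploration" and "exploitation" when choosing the next composition to synthesize. The abstract does not name the surrogate model or the optimization algorithm, so everything here is an assumption: a bootstrap ensemble of linear models stands in for the uncertainty quantification, expected improvement stands in for the selection criterion, and the function names (`bootstrap_predict`, `expected_improvement`, `select_next`) are hypothetical.

```python
import math
import numpy as np

# Assumption: a bootstrap ensemble of linear models is used only as a
# simple stand-in for whatever surrogate model the authors trained.
def bootstrap_predict(X_train, y_train, X_cand, n_models=200, seed=0):
    """Return mean and standard deviation of ensemble predictions."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    # Append a column of ones so each model fits an intercept.
    A_train = np.column_stack([X_train, np.ones(n)])
    A_cand = np.column_stack([X_cand, np.ones(len(X_cand))])
    preds = np.empty((n_models, len(X_cand)))
    for m in range(n_models):
        idx = rng.integers(0, n, size=n)  # resample with replacement
        coef, *_ = np.linalg.lstsq(A_train[idx], y_train[idx], rcond=None)
        preds[m] = A_cand @ coef
    return preds.mean(axis=0), preds.std(axis=0)

# Assumption: expected improvement (for maximization) is one common
# criterion that trades off predicted value against uncertainty.
def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` under a Gaussian prediction."""
    sigma = np.maximum(sigma, 1e-12)  # avoid division by zero
    z = (mu - best) / sigma
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

def select_next(X_train, y_train, X_cand):
    """Index of the candidate with the largest expected improvement."""
    mu, sigma = bootstrap_predict(X_train, y_train, X_cand)
    ei = expected_improvement(mu, sigma, best=np.max(y_train))
    return int(np.argmax(ei))
```

In an active learning loop of the kind described above, the selected candidate would be synthesized and measured, appended to the training data, and the selection repeated. With `X_train` holding descriptor values for measured compositions and `y_train` the measured property, `select_next(X_train, y_train, X_cand)` returns the row of `X_cand` to try next.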

Brief Introduction of Speaker
Ruihao Yuan

Ruihao Yuan received his B.S. degree from Hefei University of Technology, Hefei, China, in 2013, and his Ph.D. degree from Xi’an Jiaotong University, Xi’an, China, in 2019. During his Ph.D. studies, he spent one year as an intern at Los Alamos National Laboratory, Los Alamos, NM, USA. He is currently an Associate Professor of materials science at Northwestern Polytechnical University, Xi’an, China. His research interests include materials informatics, and he has authored more than 20 peer-reviewed papers.