Applications of Active Learning Method in Thermoelectrics
Jiong Yang1*, Yuexing Han2, Wenqing Zhang3
1 Materials Genome Institute, Shanghai University, Shanghai 200444, China
2 School of Computer Engineering and Science & Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China
3 Department of Physics and Shenzhen Institute for Quantum Science & Technology, Southern University of Science and Technology, 518055 Shenzhen, Guangdong, China
EXTENDED ABSTRACT: In active learning, a machine learning algorithm actively requests manual labels for data that the model itself has selected, in order to improve model performance. The method is particularly suited to small-sample problems, and especially to materials research, where obtaining a large number of samples is costly. Active learning improves the prediction accuracy of machine learning models and supports both the discovery of new materials and the identification of underlying materials trends. This report focuses on several applications of active learning in thermoelectric materials research. In the first case, we combined a fully connected neural network with an active learning strategy to segment, quickly and automatically, the different phases in 99 backscattered electron images obtained from high-throughput experiments and characterization of Cu-Sn-S ternary compounds. This strategy greatly reduces the time cost of image segmentation and reaches a classification accuracy of 0.9, while the introduction of active learning further reduces the manual labeling workload. The strategy identified two important non-parent phases. Further experiments show that one of them is the new compound Cu7Sn3S10, which has good thermoelectric properties, with a ZT value exceeding 0.6. The second case targets the thermoelectric power factor of diamond-like pnictides and chalcogenides. Our active learning procedure consists of two steps: (1) starting from a high-throughput calculation database, machine learning methods select the samples that need to be labeled (the candidates); (2) the electrical transport properties of the candidates are calculated for verification. The newly labeled data are added to the database, the model is updated, and the procedure is iterated to improve the extrapolation ability of the machine learning model. 
We tested different strategies for selecting candidates and finally adopted a gradient boosting regression model within the “Query by Committee” (QBC) strategy, which gave the highest extrapolation accuracy (Pearson correlation coefficient of 0.95 on the test set). Analyzing the power factors predicted by the machine learning model across the entire search space, we found that binary pnictides, chalcogenides containing vacancies and elements with small atomic radii may have larger power factors. In the third case, using a similar active learning and QBC approach combined with adaptive thresholds, we developed a machine learning potential method and applied it to predict the thermal conductivities of several thermoelectric materials.
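The two-step loop described above, with QBC as the selection rule, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the committee consists of bootstrapped least-squares lines rather than the gradient boosting regression models used in the work, and `expensive_label`, the one-dimensional descriptor, and all parameter values are hypothetical stand-ins for the transport calculations and the material descriptors.

```python
import random
import statistics


def expensive_label(x):
    """Hypothetical oracle standing in for a costly calculation or experiment."""
    return 0.5 * x * x + x


def fit_line(points):
    """Ordinary least-squares fit y = a*x + b to a list of (x, y) pairs."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:  # degenerate bootstrap sample: fall back to a constant model
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in points) / sxx
    return a, my - a * mx


def train_committee(labeled, n_members, rng):
    """Train each committee member on a bootstrap resample of the labeled set."""
    return [
        fit_line([rng.choice(labeled) for _ in labeled])
        for _ in range(n_members)
    ]


def disagreement(committee, x):
    """Committee disagreement: standard deviation of the members' predictions."""
    return statistics.pstdev([a * x + b for a, b in committee])


def qbc_loop(pool, labeled, n_rounds=5, n_members=10, seed=0):
    """Run a fixed number of query-by-committee rounds (assumed settings)."""
    rng = random.Random(seed)
    pool = list(pool)
    labeled = list(labeled)
    for _ in range(n_rounds):
        committee = train_committee(labeled, n_members, rng)
        # Step (1): select the candidate on which the committee disagrees most.
        candidate = max(pool, key=lambda x: disagreement(committee, x))
        pool.remove(candidate)
        # Step (2): label the candidate with the oracle and grow the database.
        labeled.append((candidate, expensive_label(candidate)))
    return labeled, pool


# Toy search space: 21 descriptor values from -5 to 5, three labeled at the start.
seed_points = (-0.5, 0.0, 0.5)
start = [(x, expensive_label(x)) for x in seed_points]
candidates = [i / 2 for i in range(-10, 11) if i / 2 not in seed_points]
labeled, remaining = qbc_loop(candidates, start)
print(len(labeled), "labeled samples;", len(remaining), "candidates left")
```

In this sketch the committee's spread serves as the uncertainty estimate, so each round spends the labeling budget on the point where the bootstrapped models agree least. A fixed number of rounds is an assumption made for brevity; a stopping rule based on a disagreement threshold would be an alternative.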
REFERENCES
[1] Ye Sheng, Jinyang Xi, Yuexing Han, and Jiong Yang et al., Chemistry of Materials, 33 (2021) 6918
[2] Ye Sheng, Jiong Yang, and Wenqing Zhang et al., npj Computational Materials, 6 (2020) 171
[3] Hongliang Yang, Jiong Yang, and Wenqing Zhang et al., Phys. Rev. B, 104 (2021) 094310
Jiong Yang graduated from the Shanghai Institute of Ceramics, Chinese Academy of Sciences, and worked as a postdoctoral fellow at the University of Washington in the United States. He is currently a professor and doctoral supervisor at the Materials Genome Institute, Shanghai University. He has long been engaged in research on materials physics related to electron-phonon interaction, thermoelectric materials design, and the materials genome, and has published more than 140 papers, with an H-index of 40. He received the 2019 International Thermoelectric Society Young Scientist Award for his work on thermoelectric materials genome research, and he is a core member of the thermoelectric materials track of two National Key Research and Development Program projects.