S3-13 MatCloud+: High-throughput Multiscale Materials Simulation and Machine Learning on Cloud

MatCloud+: High-throughput Multiscale Materials Simulation and Machine Learning on Cloud
Xiaoyu Yang
Computer Network Information Center, Chinese Academy of Sciences


EXTENDED ABSTRACT: A key challenge in using artificial intelligence for materials design is the lack of data. High-throughput multiscale materials simulation can produce such data, but the difficulty of running it remains a barrier for many users. For example, users have to understand Linux and spend time finding computational and storage resources. Once a simulation completes, the core materials data must be extracted from the simulation results, and storing the extracted data in a database for machine learning and proper data management requires further work. MatCloud+ is a Cloud-based computational infrastructure for the integrated management of materials simulation, data, and computing resources. It is directly connected to a computing cluster and a materials simulation database, and integrates the computing facilities, data, various scripts, and simulation codes to automatically manage the creation and running of simulation jobs, the subsequent extraction of core output information, and the longer-term archival of materials property data. One of the important novelties of MatCloud+ is that it integrates high-throughput multiscale materials simulations (e.g. Quantum ESPRESSO, LAMMPS) with machine learning on the Cloud. Once a simulation completes, the required materials properties are acquired and preserved in the materials property database. The more users use MatCloud+, the more simulation data accumulates in MatCloud+. Users who do not wish to make their data public can set it to be unsearchable by others. This talk illustrates the challenges of integrating machine learning with high-throughput multiscale materials simulations, and the benefits to users of integrating materials simulation, data, HPC, and AI.
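The extract-and-archive step described above can be illustrated with a minimal sketch. It is not MatCloud+ code: the function names, the table layout, the job identifier, and the sample output text are all hypothetical, and an in-memory SQLite database stands in for the materials property database. The only format assumption is that a Quantum ESPRESSO pw.x output marks the converged total energy with a line beginning with "!".

```python
import re
import sqlite3

# Converged total-energy line as printed by Quantum ESPRESSO pw.x, e.g.
# "!    total energy              =     -15.85012345 Ry"
ENERGY_RE = re.compile(r"^!\s+total energy\s+=\s+(-?\d+\.\d+)\s+Ry", re.MULTILINE)


def extract_total_energy(output_text):
    """Return the last converged total energy (in Ry) found in the text, or None."""
    matches = ENERGY_RE.findall(output_text)
    return float(matches[-1]) if matches else None


def archive_property(conn, job_id, name, value, unit):
    """Store one extracted property in a hypothetical 'properties' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS properties "
        "(job_id TEXT, name TEXT, value REAL, unit TEXT)"
    )
    conn.execute(
        "INSERT INTO properties VALUES (?, ?, ?, ?)", (job_id, name, value, unit)
    )
    conn.commit()


# Made-up output fragment for illustration only: the first line is an
# intermediate SCF energy (no "!"), the second is the converged value.
sample_output = """
     total energy              =     -15.84000000 Ry
!    total energy              =     -15.85012345 Ry
"""

conn = sqlite3.connect(":memory:")  # stand-in for the property database
energy = extract_total_energy(sample_output)
if energy is not None:
    archive_property(conn, "job-001", "total_energy", energy, "Ry")

rows = conn.execute("SELECT job_id, name, value, unit FROM properties").fetchall()
```

A real pipeline would extract many more properties per job and would likely use a shared database server rather than SQLite; the sketch only shows how extraction and archival can follow a completed job without manual steps.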

Brief Introduction of Speaker
Xiaoyu Yang

Prof. Xiaoyu Yang currently works at the Computer Network Information Center, Chinese Academy of Sciences (CAS). He joined CAS in 2012 and was awarded the “100 Talent Program” fellowship of the Chinese Academy of Sciences. Prof. Yang’s research currently focuses on the Materials Genome Initiative, including materials informatics, materials genome initiative informatics, and materials simulation and data infrastructure. Prof. Yang completed his post-doctoral research at the University of Cambridge, UK, in 2008, where he was a Research Associate in the Department of Earth Sciences and an affiliated software engineer in the Cambridge e-Science Centre. He joined the School of Electronics and Computer Science, University of Southampton, UK, in 2008 as a Research Engineer. In 2010, he worked in the Reading e-Science Center, University of Reading. Prof. Yang earned an MSc degree in IT (2001) and a PhD degree in Systems Engineering (2006) from the Faculty of Computing Science and Engineering at De Montfort University, UK.