6-23. High-throughput cloud platform and enterprise service for drug discovery and development

Peiyu Zhang, Yang Liu, Xuekun Shi, Jian Ma

XtalPi Inc. (Shenzhen Jingtai Technology Co., Ltd.)

Floor 4, No. 9, Hualian Industrial Zone, Dalang Street, Longhua District, Shenzhen, China

Abstract: By tightly integrating quantum physics, artificial intelligence (AI), and high-performance cloud computing, XtalPi has been developing its Intelligent Digital Drug Discovery and Development (ID4) platform to provide solutions and accurate property predictions for drug design, solid-form selection, and other critical aspects of drug development. The ID4 platform has accelerated more than 100 new drug pipelines, greatly improving R&D efficiency and lowering cost.

This paper describes the high-throughput ID4 platform in detail, along with some typical service case studies. Figure 1 shows the platform architecture. The platform has five main features. 1) We have designed a multi-cloud architecture in close collaboration with leading cloud service providers, including AWS, Tencent Cloud, and Google Cloud. 2) We have set up a separate computing cluster that scales up to millions of cores within hours, which speeds up drug R&D. The job management system integrates seamlessly with the different cloud computing clusters, which ensures scalability across clouds while keeping overall CPU usage above 90%. 3) Our computing architecture follows cloud-native design principles: all algorithms are packaged as Docker services and shipped to the cloud automatically through a full-stack DevOps toolchain. 4) We apply a data lake design to data management and storage, which makes it easy to visualize results and quickly aggregate data for further analysis and AI modeling. As data accumulates, the machine learning models can be retrained iteratively to achieve better performance. 5) We have built a full-stack, compliant information security management system to protect data in drug discovery and development, covering four main aspects: cloud security, data security, operation security, and compliance.

Figure 1. The cloud platform architecture for drug discovery and development at XtalPi.
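As a rough illustration of how a job management layer might spread work across clusters on several clouds and track overall core utilization, the following minimal Python sketch may help. It is not XtalPi's scheduler: the `Cluster` class, the `dispatch` and `utilization` functions, the cluster sizes, and the job names are all hypothetical, and the greedy "largest job first, emptiest cluster first" policy is only one simple assumption among many possible placement strategies.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """A hypothetical compute cluster hosted on one cloud provider."""
    name: str
    total_cores: int
    used_cores: int = 0
    jobs: list = field(default_factory=list)

    @property
    def free_cores(self) -> int:
        return self.total_cores - self.used_cores


def dispatch(jobs, clusters):
    """Greedily place jobs, largest first, on the cluster with the most free cores.

    `jobs` maps a job id to the number of cores it needs. Returns the ids of
    jobs that did not fit on the emptiest cluster at the time they were tried.
    """
    unplaced = []
    for job_id, cores in sorted(jobs.items(), key=lambda kv: kv[1], reverse=True):
        target = max(clusters, key=lambda c: c.free_cores)
        if target.free_cores >= cores:
            target.used_cores += cores
            target.jobs.append(job_id)
        else:
            unplaced.append(job_id)
    return unplaced


def utilization(clusters) -> float:
    """Fraction of all provisioned cores that are currently busy."""
    total = sum(c.total_cores for c in clusters)
    return sum(c.used_cores for c in clusters) / total if total else 0.0


# Illustrative numbers only: three small clusters and five jobs.
clusters = [
    Cluster("aws", 64),
    Cluster("tencent-cloud", 32),
    Cluster("google-cloud", 32),
]
jobs = {"crystal-1": 48, "crystal-2": 32, "fep-1": 24, "fep-2": 16, "dock-1": 4}

unplaced = dispatch(jobs, clusters)
for c in clusters:
    print(c.name, c.used_cores, "/", c.total_cores, c.jobs)
print("unplaced:", unplaced)
print("overall utilization:", utilization(clusters))
```

In a real deployment the provisioned core count would itself change as clusters scale out and in, so utilization depends on both the placement policy and the scaling policy; this sketch holds cluster sizes fixed to keep the idea visible.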


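Progress on a high-throughput cloud platform and enterprise service for drug discovery and development

Peiyu Zhang (张佩宇), Yang Liu (刘阳), Xuekun Shi (师雪坤), Jian Ma (马健)

Shenzhen Jingtai Technology Co., Ltd. (深圳晶泰科技有限公司)

Floor 4, No. 9, Hualian Industrial Zone, Dalang Street, Longhua District, Shenzhen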

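Abstract: By combining quantum physics, artificial intelligence, and a high-performance cloud computing platform, we have developed the Intelligent Digital Drug Discovery and Development (ID4) platform. The platform provides solutions and drug property predictions for drug design, crystal form screening, and other areas of drug development. Its application has accelerated more than 100 new drug pipelines, improving efficiency and lowering the cost of drug R&D.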

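This paper describes the high-throughput drug R&D platform in detail, along with progress on typical service cases. The platform architecture is shown in Figure 1, and its features fall into five areas. 1) A multi-cloud architecture: a globally schedulable computing platform built on top of multiple public clouds. It is currently connected to several well-known public cloud vendors, including AWS, Tencent Cloud, and Google Cloud. 2) By exploiting the strong elastic scaling of public clouds, the computing platform can create a super cluster of nearly one million compute cores within a few hours, making full use of parallel computing. In addition, the in-house task scheduler works seamlessly with the cluster management system, which makes cluster scaling smarter and keeps average cluster utilization at around 90%. 3) The computing platform follows cloud-native architecture principles, fully adopting microservices and a complete DevOps toolchain. 4) The platform uses a data lake as its main approach to data governance, so that the massive data produced by computation is not only presented directly as results but can also be used for data analysis, machine learning, and similar scenarios. As data accumulates, algorithms such as drug screening and crystal form prediction can be trained continuously, improving their capability. 5) Drug R&D places high demands on data security and confidentiality. We have built a full-stack, compliant information security system covering four areas: cloud security, data security, operation security, and compliance.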

Brief Introduction of Speaker
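Peiyu Zhang (张佩宇)

Peiyu Zhang co-founded Shenzhen Jingtai Technology Co., Ltd. (XtalPi) in 2015 and serves as Chief Scientist, responsible for the company's R&D. Previously an associate researcher at the Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Zhang developed quantum dynamics methods by combining neural networks with algorithms such as GPU parallelization. At XtalPi, Zhang has led the development of the company's high-throughput drug R&D platform, which combines artificial intelligence, computational chemistry, and cloud computing and is applied to new drug discovery and to drug solid-form screening and design, reducing drug R&D costs and improving efficiency. The team actively promotes the industrial application of computational chemistry and artificial intelligence, providing complete drug design and drug solid-form screening solutions to global pharmaceutical companies such as Pfizer and Roche.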

Email: peiyu.zhang@xtalpi.com