Cloud Computing for Drug Discovery: The Time is Now


Ashutosh Jogalekar, PhD, Head of Product, OpenEye, Cadence Molecular Sciences

Small molecule discovery is the first of several bottlenecks in a drug development program, which typically lasts about 12 years1 and costs upward of $2.6 billion.2 The goal of discovery is therefore “to deliver one or more clinical candidate molecules, each of which has sufficient evidence of biological activity at a target relevant to a disease as well as sufficient safety and drug-like properties so that it can be entered into human testing.”3 This level of investment in human and economic capital, added to the opportunity costs of delays in approval, provides huge incentives for executing discovery programs efficiently and expeditiously.

Traditional small molecule drug discovery employs physical compound libraries and chemical screens to generate molecular starting points or hits. Through an iterative process, hits are modified chemically and re-tested to arrive at candidate molecules suitable for animal testing and human trials. With the exponential growth in computing power and rapidly falling costs, more and more of drug discovery’s physical assets exist not in -80°C freezers but in silico. Similarly, the wet-chemical binding and inhibition assays once conducted in test tubes or microwells are increasingly replaced by computational resources that predict drug-target interactions to narrow down the number of assays and experiments that have to be run.

As computational methods evolve, so do the hardware and software choices drug discovery organizations face daily. One notable trend has been the migration of software and hardware residing at a development organization’s physical location to a central, off-site repository—either web-hosted or entirely in “the cloud.” This trend began with enterprise-level computing but now encompasses data-centered applications like electronic laboratory notebooks (ELNs) and laboratory information management systems (LIMSs). LIMSs and ELNs have been around since cheap personal computing became available in the 1980s, when 100% of installations were hosted at the users’ sites. By 2019, though, cloud-based LIMSs predominated, with predicted year-to-year growth of more than 17%, compared with 11% for all LIMSs.4

The attractiveness of cloud computing for drug discovery arises from the scale of calculations and models that need to be run, the quantity and complexity of possible drug-target interactions, the advantages to having software-based discovery engines available and accessible to all stakeholders at all times, and the integration of discovery results with a variety of software-based statistical, modeling, and documentation tools.

Until quite recently, computer-aided drug discovery was viewed as just another tool to assist in selecting “hit” molecules, performing calculations, tabulating and tracking results, and managing data overload. As the components of cloud computing emerged and coalesced through interoperability, information technology evolved from a tool into an active participant in drug discovery. Or, as one group of commentators has put it, modern drug discovery has evolved from computer-aided to computer-driven.5

The marriage of cloud computing and drug discovery is possible through:

  • In-silico simulation of select aspects of drug discovery experiments such as protein binding and solubility
  • Interoperability of computational processes with physicochemical assays
  • Quantum mechanics (QM)-based computational methods
  • Artificial intelligence for confirming anticipated trends and for uncovering previously unknown drug-target interactions

Benefits of Cloud Computing

Cloud computing in drug discovery arose from the need to access and consolidate computational tools that have evolved since the late 1980s, particularly given the huge compound libraries involved and the desire to screen those compounds electronically. The cloud is where big data resources—including compound libraries and applications that once were housed exclusively at user sites—will eventually reside. Cloud computing provides the opportunity to draw from any combination of these resources.

Computational approaches are broadly classified as either structure-based or ligand-based. Analogous to physical high-throughput screening, structure-based techniques require knowledge of both the target and the ligand or test molecule. These techniques include, for example, ligand docking, homology modeling, and molecular dynamics.6 Ligand-based discovery is more empirical in that it predicts a test molecule’s activity (both “efficacy”-related attributes and toxicity) by comparing it with molecules known to be active. These techniques include, for example, pharmacophore detection and shape-based analog searches. By serving as a repository for both data and applications, the cloud empowers both approaches.7
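To make the ligand-based idea concrete, the following is a minimal sketch of a similarity search, assuming each molecule has already been reduced to a fingerprint (here simply a set of "on" bit indices). The Tanimoto coefficient is a widely used similarity measure for such fingerprints; the compound names, fingerprints, and the 0.5 threshold are hypothetical values chosen only for illustration, not data from any real library.

```python
def tanimoto(fp_a: frozenset, fp_b: frozenset) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_by_similarity(query, library, threshold=0.5):
    """Return (name, score) pairs scoring at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: a known active and a three-compound library.
known_active = frozenset({1, 4, 7, 9, 12})
library = {
    "cmpd_A": frozenset({1, 4, 7, 9, 13}),   # shares 4 of 6 distinct bits
    "cmpd_B": frozenset({2, 3, 5}),          # shares no bits
    "cmpd_C": frozenset({1, 4, 7, 9, 12}),   # identical to the query
}

hits = rank_by_similarity(known_active, library)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

In practice the fingerprints would come from a cheminformatics toolkit and the library would hold millions of entries, which is where hosted storage and compute become relevant.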

The fundamental rationale for computational discovery methods is the large number—millions or billions—of library entries and the nearly infinite number and types of interactions they are likely to encounter with targets. When hosted in the cloud these resource-intensive processes become available across time and space and are scheduled to maximize CPU and GPU utilization.
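A rough back-of-the-envelope sketch shows why library size drives the case for elastic compute. It assumes every compound costs the same amount of compute and that the work parallelizes perfectly across workers, which real schedulers only approximate; the one-second-per-compound cost and the worker counts are hypothetical figures, not benchmarks.

```python
def wall_clock_hours(n_compounds: int, seconds_per_compound: float, n_workers: int) -> float:
    """Idealized wall-clock time, assuming equal-cost tasks and perfect parallelism."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    per_worker = -(-n_compounds // n_workers)  # ceiling division on integers
    return per_worker * seconds_per_compound / 3600.0


def chunk_ranges(n_compounds: int, chunk_size: int):
    """Yield (start, stop) index ranges that a scheduler could hand to workers."""
    for start in range(0, n_compounds, chunk_size):
        yield start, min(start + chunk_size, n_compounds)


# Hypothetical billion-compound library at one second of compute per compound.
single = wall_clock_hours(1_000_000_000, 1.0, 1)
fleet = wall_clock_hours(1_000_000_000, 1.0, 10_000)
print(f"1 worker: {single / 8760:.1f} years; 10,000 workers: {fleet:.1f} hours")
```

Under these assumptions a job that would occupy a single processor for decades fits into roughly a day on a large, temporarily rented fleet.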

More significantly, and unlike siloed applications, cloud computing allows users to access and utilize custom, configurable computing resources as networked services on demand.8 As such, it addresses two problems with traditional drug discovery: limited efficiency and limited coverage of chemical space.

By accessing deeper computational methods, cloud applications help improve discovery efficiency by selecting test compounds, whether they exist physically or only within virtual libraries, with superior target binding and drug-like properties. The advantages are not just speed and operational efficiency, but the discovery of novel chemical space, which is particularly useful for targets that have traditionally been difficult to “drug.”9

The significance of covering novel chemical space cannot be overstated, as existing libraries (again, whether housed in thousands of microplates or as billions of computer entries) have already been extensively mined. For example, conventional discovery has produced very few usable target-ligand combinations for G protein-coupled receptors. In the cloud, combining homology modeling with molecular dynamics provides structural data which, combined with cloud-based quantum chemistry applications and advanced searching of novel chemical space, can lead to promising candidate molecules for hitherto undruggable targets.10

Advanced Tools

The lure of computational approaches for drug discovery is the potential for a more rapid or accurate understanding of physical ligand-target interactions than might be achieved through actual experimentation. Numerous inputs are required, including modeling based on molecular characteristics (e.g., shape, local charge, lipophilicity, steric factors) and ligand-target complementarity. Crystal structures were once the only way to obtain this information, but crystallography is expensive and not every ligand-receptor pair can be easily crystallized. Computer-aided drug design (CADD) reproduces these events in software and has been successfully deployed in discovery programs for several highly successful drugs.11

Early-generation CADD worked its magic through representations of ligands and substrates in more-or-less conventional terms using parameters like molecular topography and hydrophobicity. Second-generation approaches, referred to as molecular mechanical methods, incorporate conformational potential energies and electrostatics into their algorithms, thus adding another layer to interaction analysis.
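For readers unfamiliar with molecular mechanical methods, the sketch below shows the kind of energy terms they sum: a harmonic bond-stretch term, a Lennard-Jones term, and a Coulomb electrostatic term. These are standard textbook functional forms rather than the formulation of any particular force field or product, and every parameter value in the example is hypothetical.

```python
# Coulomb constant in kcal*angstrom/(mol*e^2), the unit system common in biomolecular force fields.
COULOMB_KCAL = 332.0637


def harmonic_bond(r: float, r0: float, k: float) -> float:
    """Bond-stretch energy; the 1/2 prefactor is one convention, some force fields omit it."""
    return 0.5 * k * (r - r0) ** 2


def lennard_jones(r: float, epsilon: float, sigma: float) -> float:
    """12-6 Lennard-Jones energy: short-range repulsion plus dispersion attraction."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r: float, q_i: float, q_j: float, dielectric: float = 1.0) -> float:
    """Electrostatic energy between two point charges given in units of e."""
    return COULOMB_KCAL * q_i * q_j / (dielectric * r)


def nonbonded_pair(r, epsilon, sigma, q_i, q_j):
    """Total nonbonded energy for one atom pair at separation r (angstroms)."""
    return lennard_jones(r, epsilon, sigma) + coulomb(r, q_i, q_j)


# Hypothetical atom pair: opposite partial charges at 3.5 angstroms.
print(f"{nonbonded_pair(3.5, 0.2, 3.4, 0.4, -0.4):.2f} kcal/mol")
```

A full calculation sums terms like these over every bonded and nonbonded pair in the ligand-target system, which is why cost grows quickly with system size.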

The major benefit of cloud computing to drug discovery, then, is ready access to a host of on-demand software resources that can vastly improve the scale and accuracy of computer-aided drug design techniques and which users could otherwise acquire only at great expense and effort.

Among these advanced tools are artificial intelligence (AI), machine learning, and quantum mechanics (QM)-based computational methods. AI is defined as “a system’s ability to interpret external data correctly, to learn from such data, and to use those learnings to achieve specific goals and tasks through flexible adaptation.”12 AI is an umbrella term encompassing computational methods that simulate human intelligence, especially pattern matching.

Machine learning, a subset of AI, encompasses software through which a computer learns to predict similar or novel patterns in data based on training on large datasets.13 In a drug discovery setting this could mean, for example, correctly correlating “druggability” with ligand-target interactions that were previously thought to be minor or irrelevant. The goals of a machine learning system are to describe or explain phenomena, predict what will occur when inputs are changed, and prescribe specific actions to achieve desired goals.

AI has the potential to supercharge and extract maximum actionable information from both experimental and theoretical cloud-based discovery applications, including those that are not natively interrelated or interoperable. These include data from instrumental methods (crystallography, nuclear magnetic resonance, etc.) that can be used as inputs for docking, molecular dynamics, target- and ligand-based pharmacophore characterization, quantitative structure-activity relationships (QSAR), and similarity search.14 All these applications exist in both “desktop” and cloud formats, but through the cloud users can access them 24/7 without the need to purchase or maintain hardware resources at their site.

Application modularity, which allows users to try or select from a host of applications purchased and maintained by a service company, becomes an even more compelling case for emerging discovery applications, particularly those involving resource-intensive quantum chemistry calculations. While molecular mechanics (MM) is based on widely understood classical mechanics,15 emerging QM approaches are not. QM takes CADD and MM to a new level where chemical bonds are ignored. Interacting species are instead described in terms of atomic nuclei and electron clouds. QM methods provide the most accurate representation of what is occurring in ligand-target interactions, including parameters like vibrational frequencies, equilibrium molecular structure, dipole moments, and reaction free energies, many of which aren’t easily accessible experimentally.16

QM is extremely resource-intensive and limited in terms of the system complexity it can handle. To provide it with a broader scope and greater accessibility, developers have combined it with MM and CADD to generate hybrid computational systems. Hybrid approaches, for example, QM-MM,17 QM virtual screening,18 and QM-QSAR,19 are extremely complex and require dedicated staff for implementation and maintenance, which is why they are best accessed through a cloud-based service.
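One common way to combine the two levels of theory is a subtractive scheme: the whole system is treated with MM, a small region of interest (for example, the ligand and nearby residues) is recomputed with QM, and the MM energy of that region is subtracted to avoid counting it twice. The sketch below illustrates only this bookkeeping. It is not the method of any cited study, and the energies are hypothetical placeholders that real QM and MM engines would supply.

```python
def subtractive_qmmm_energy(e_mm_full: float, e_qm_region: float, e_mm_region: float) -> float:
    """Subtractive QM/MM total: MM on everything, with the region's MM energy swapped for QM."""
    return e_mm_full + e_qm_region - e_mm_region


# Hypothetical energies (all in the same units) from separate MM and QM calculations.
e_total = subtractive_qmmm_energy(
    e_mm_full=-1200.0,   # MM energy of the full ligand-target system
    e_qm_region=-350.0,  # QM energy of the small region of interest
    e_mm_region=-300.0,  # MM energy of that same region
)
print(e_total)
```

The expensive QM step touches only the small region, which is what lets hybrid methods reach systems that full QM cannot handle.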

Cloud computing addresses a fundamental limitation of CADD. As discovery organizations learned with LIMSs and ELNs, individual site-based deployments are highly resource-intensive and limited to whatever installed applications are available. Cloud-based services do not require users to purchase, install, or maintain applications on-site.

Security is arguably the primary concern for pharmaceutical companies considering cloud computing. While cloud service providers have not solved all security issues, the root causes of security breaches are fairly well understood. According to a report by McKinsey Digital, “Almost all breaches in the cloud stem from misconfiguration, rather than from attacks that compromise the underlying cloud infrastructure.”20

Companies therefore “must adopt new security architectures and processes to protect their cloud workloads.” Industrial consulting group Gartner goes even further in stating that “99% of cloud security failures will be the customer’s fault.”21 In other words, actual security breaches overwhelmingly arise from users’ failure to assess risks appropriately and to follow industry-standard cloud security practices.

Rather than viewing data security as a necessary “feature” of, or add-on to cloud discovery services, potential users need to consider security itself as a service. From this perspective one can pose similar questions around data safety as one would ask about the cloud-resident discovery application: Do we have the resources to implement a robust security plan ourselves, in-house? Can a cloud service provider do a better job than we can? A typical discovery organization will probably answer “no” and “yes,” respectively, which essentially eliminates the security conundrum.

Conclusion

CADD has evolved from simple modeling, docking, and archiving capabilities to a collection of applications involving very high dollar, computational, and human resource overhead. Cloud computing is the most efficient, cost-effective way to leverage advanced computational applications, including hybrid quantum methods, without the need to source, install, and maintain those applications onsite. The cloud also provides a means of interrogating billion-compound virtual libraries for novel chemical space. Rather than perceiving the cloud as a potential source of data breaches, cloud users increasingly view the security provided by cloud-based services as an improvement over in-house data security, a bonus, if you will.

References

  1. DiMasi JA, Feldman L, Seckler A, Wilson A. Trends in risks associated with new drug development: success rates for investigational drugs. Clin Pharmacol Ther. 2010 Mar;87(3):272-7. doi: 10.1038/clpt.2009.295. Epub 2010 Feb 3. PMID: 20130567.
  2. DiMasi JA, Grabowski HG, Hansen RW. Innovation in the pharmaceutical industry: New estimates of R&D costs. J Health Econ. 2016 May;47:20-33. doi: 10.1016/j.jhealeco.2016.01.012. Epub 2016 Feb 12. PMID: 26928437.
  3. Mohs RC, Greig NH. Drug discovery and development: Role of basic biological research. Alzheimers Dement (N Y). 2017 Nov 11;3(4):651-657. doi: 10.1016/j.trci.2017.10.005. PMID: 29255791; PMCID: PMC5725284.
  4. Abdalslam. LIMS Systems Statistics, Trends, And Facts 2023. Accessed May 10, 2023, at https://abdalslam.com/lims-systems-statistics.
  5. Frye L, Bhat S, Akinsanya K, Abel R. From computer-aided drug discovery to computer-driven drug discovery. Drug Discov Today Technol. 2021;39:111-117. doi: 10.1016/j.ddtec.2021.08.001.
  6. Sliwoski G, Kothiwale S, Meiler J, Lowe EW Jr. Computational methods in drug discovery. Pharmacol Rev. 2013 Dec 31;66(1):334-95. doi: 10.1124/pr.112.007336. PMID: 24381236; PMCID: PMC3880464.
  7. Spjuth O, Frid J, Hellander A. (2021) The machine learning life cycle and the cloud: implications for drug discovery, Expert Opinion on Drug Discovery, 16:9, 1071-1079, DOI: 10.1080/17460441.2021.1932812.
  8. Mell P, Grance T. The NIST Definition of Cloud Computing. Special Publication (NIST SP) 800-145; 2011.
  9. Addison E, Keinan S. Using Quantum Molecular Design & Cloud Computing to Improve the Accuracy & Success Probability of Drug Discovery. Drug Development & Delivery. March 2016, Vol 16 No 2.
  10. Sekharan S, Wei JN, Batista VS. The active site of melanopsin: the biological clock photoreceptor. J Am Chem Soc. 2012 Dec 5;134(48):19536-9. doi: 10.1021/ja308763b. Epub 2012 Nov 19. PMID: 23145979.
  11. Druker BJ, Lydon NB. Lessons learned from the development of an abl tyrosine kinase inhibitor for chronic myelogenous leukemia. J Clin Invest. 2000 Jan;105(1):3-7. doi: 10.1172/JCI9083. PMID: 10619854; PMCID: PMC382593.
  12. Haenlein, M., & Kaplan, A. (2019). A Brief History of Artificial Intelligence: On the Past, Present, and Future of Artificial Intelligence. California Management Review, 61(4), 5–14. https://doi.org/10.1177/0008125619864925.
  13. MIT Management. Machine Learning Explained. Sara Brown, April 2021. Accessed on May 11, 2023, at https://mitsloan.mit.edu/ideas-made-to-matter/machine-learning-explained.
  14. Yu W, MacKerell AD Jr. Computer-Aided Drug Design Methods. Methods Mol Biol. 2017;1520:85-106. doi: 10.1007/978-1-4939-6634-9_5. PMID: 27873247; PMCID: PMC5248982.
  15. Arodola OA, Soliman ME. Quantum mechanics implementation in drug-design workflows: does it really help? Drug Des Devel Ther. 2017 Aug 31;11:2551-2564. doi: 10.2147/DDDT.S126344. Erratum in: Drug Des Devel Ther. 2017 Nov 08;11:3205. PMID: 28919707; PMCID: PMC5587087.
  16. Arodola OA, Soliman ME. Quantum mechanics implementation in drug-design workflows: does it really help? Drug Des Devel Ther. 2017 Aug 31;11:2551-2564. doi: 10.2147/DDDT.S126344. Erratum in: Drug Des Devel Ther. 2017 Nov 08;11:3205. PMID: 28919707; PMCID: PMC5587087.
  17. Kar R. Benefits of hybrid QM/MM over traditional classical mechanics in pharmaceutical systems. Drug Discovery Today, Volume 28, Issue 1, 2023. https://doi.org/10.1016/j.drudis.2022.103374.
  18. Dalal, V., Dhankhar, P., Singh, V. et al. Structure-Based Identification of Potential Drugs Against FmtA of Staphylococcus aureus: Virtual Screening, Molecular Dynamics, MM-GBSA, and QM/MM. Protein J 40, 148–165 (2021). https://doi.org/10.1007/s10930-020-09953-6.
  19. Kulkarni U. Prajakta, Shah Harshil, and Vyas K. Vivek. Hybrid Quantum Mechanics/Molecular Mechanics (QM/MM) Simulation: A Tool for Structure-Based Drug Design and Discovery. Mini-Reviews in Medicinal Chemistry 2022; 22(8). https://dx.doi.org/10.2174/1389557521666211007115250.
  20. McKinsey Digital. Security as code: The best (and maybe only) path to securing cloud applications and systems. Accessed May 12, 2023.
  21. Gartner. Is the cloud secure? October 10, 2019. Accessed May 12, 2023.
  22. Rai, Brajesh K.; Sresht, Vishnu; Yang, Qingyi; Unwalla, Ray; Tu, Meihua; Mathiowetz, Alan M.; and Bakken, Gregory A. Comprehensive Assessment of Torsional Strain in Crystal Structures of Small Molecules and Protein-ligand Complexes using ab initio Calculations. J. Chem. Inf. Model. 2019, 59, 10, 4195–4208. https://pubs.acs.org/doi/abs/10.1021/acs.jcim.9b00373.
  23. Grebner, Christoph; Malmerberg, Erik; Shewmaker, Andrew; Batista, Jose; Nicholls, Anthony; and Sadowski, Jens. Virtual screening in the cloud: How big is big enough? J. Chem. Inf. Model. 2020, 60, 9, 4274–4282. https://pubs.acs.org/doi/10.1021/acs.jcim.9b00779.

Author Biography

Ashutosh Jogalekar is a computational chemist with experience in preclinical drug discovery. Ashutosh has worked for more than a decade in several startups where he applied his expertise to identify and optimize small molecule compounds for drug development. He also has worked in product management, where he oversaw the development of software solutions for drug discovery cloud labs. Currently, Ashutosh serves as the Head of Product at OpenEye, Cadence Molecular Sciences. In this role, he leads the development of innovative software solutions at scale to help drug discovery scientists accelerate their research. Ashutosh holds a PhD in organic chemistry from Emory University.
