Computer-Aided Drug Design Methods – An update

Computer-aided drug design (CADD) approaches are playing an increasingly important role in understanding the fundamentals of ligand-receptor interactions and helping medicinal chemists design therapeutics. About five years ago, we presented a chapter devoted to an overview of CADD methods and covered typical CADD protocols, including structure-based drug design (SBDD) and ligand-based drug design (LBDD) approaches, that were frequently used in the antibiotic drug design process. Advances in computational hardware and algorithms and emerging CADD methods are enhancing the accuracy and capability of CADD in drug design and development. In this chapter, we provide an update to our previous chapter, with a focus on new CADD approaches from our laboratory and others that can be employed to facilitate the development of antibiotic therapeutics.

1. Introduction

Following the significant milestone that penicillin represents in human medical history, the battle between humans and bacteria has never settled down and has become even more intense owing to the steady rise of drug resistance. This problem persists despite the availability of a large number of antibiotic drugs, indicating the need for novel antibiotic drug classes to overcome resistance (1, 2). To meet this need, computer-aided drug design (CADD) methods are very helpful tools and have been regularly used to study structure-function relationships of antibiotic targets that contribute to drug resistance and to search for new antibiotics, at a lower cost than purely experimental wet-lab methods, owing to powerful modern computational resources (3, 4).

Previously, we published a chapter in the first edition of this book that was dedicated to an overview of CADD and included information on routinely utilized protocols, especially tools used in our laboratory, towards the design of antibiotic therapeutics (4). Applications of these CADD methods in real-life studies were also presented. Over the past five years, CADD methods have been employed extensively by the computational chemistry community and by us to facilitate the development of novel antibiotics. This has included studies on the mechanisms behind antibiotic resistance that may help to guide the design of new antibiotic drugs to overcome such resistance. For example, Stote et al. studied the mechanism of S222T mutation-induced resistance of deoxy-D-xylulose 5-phosphate reductoisomerase (DXR) to fosmidomycin using molecular dynamics (MD) simulations (5). The MD simulations revealed the structural and energetic basis of the resistance induced by the single mutation, shedding light on the development of new antibiotic compounds targeting DXR. Verma et al. recently explored the molecular mechanism of polymyxin E (colistin) resistance in mobile colistin-resistant (mcr-1) bacteria (6). Colistin is the only FDA-approved membrane-active drug to tackle Gram-negative bacteria despite its high toxicity. However, the appearance of mcr-1 bacteria, identified in 2015, has worsened the situation (7). MD simulations revealed the mechanism by which colistin disrupts the outer membrane of normal Gram-negative bacteria and dissected the mechanism of resistance in mcr-1 bacteria to the action of colistin and other cationic peptides, which is due to the covalently attached phosphoethanolamine modification of lipids. The simulation results provide clues for the design of new membrane disruptors to treat mcr-1 infections.

Identifying and developing drug candidates against novel antibiotic targets for specific bacteria remains an important alternative approach to overcoming antibiotic resistance. Heme oxygenase (HemO) is a novel antibiotic target involved in the metabolism of heme by bacteria, as required to access iron. Previous bioassay data supported Pseudomonas aeruginosa HemO (pa-HemO) as a promising therapeutic target for pa infections, as its inhibitors block a key mechanism of the iron acquisition system (8, 9). In collaboration with the Wilks laboratory, our group has continued to apply CADD methods to optimize pa-HemO small molecule inhibitors. In a recent study, a series of inhibitors based on a previously established scaffold were designed and tested to develop a structure-activity relationship (SAR) (10). Binding orientations and affinities were predicted and used to interpret the SAR. A good correlation between predicted affinities and bioassay potency data was observed, validating the utility of the computational model in the further refinement of the current scaffold targeting pa-HemO. In another recent study, the structure of the Clostridioides difficile (C. diff) binary toxin (CDTa and CDTb), which is associated with the most serious outbreaks of drug-resistant C. diff infection in the 21st century, was solved (11). Using normal mode analysis, we explored the possible mechanism behind the translocation of CDTa, the enzymatic component, assisted by CDTb, which serves as the pore-forming or delivery subunit. Such analysis helps to elucidate the C. diff binary toxin infection mechanism and to shape potential therapeutic strategies in the future (11).

The search for new antibiotics against established targets also continues, with CADD methods playing important roles. Our laboratory, together with de Leeuw and coworkers, is continuing the design of novel agents against bacterial cell wall biosynthesis (12, 13). In a recent study, SAR for a series of compounds with a benzothiazole indolene scaffold was pursued, targeting the essential bacterial cell wall precursor molecule Lipid II (14). Using MD simulations, we predicted binding free energies and binding modes of Lipid II binders and gained atomic details on the interactions between the designed molecules and Lipid II, information that will be useful for further development of antibacterial therapeutics.

β-lactamase has been used as a target in combination therapy against multi-drug resistant Enterobacteriaceae (15). β-lactamase inhibitors may help to inactivate the β-lactamase enzyme of the pathogen and restore the function of β-lactam antibiotics, thereby overcoming the enzyme-mediated resistance. Using the full Site Identification by Ligand Competitive Saturation (SILCS) (16–18) technology developed in our laboratory (Figure 1), we identified β-lactamase CMY-10 inhibitors with our experimental collaborators (19). The SILCS-based CADD method was fully described in the first edition of our chapter (4) as well as in another chapter previously published in this same book series (20). The de novo drug design started with SILCS simulations, which are all-atom explicit-solvent combined Grand Canonical Monte Carlo/MD (GCMC/MD) simulations that include small organic solutes such as propane, benzene, methanol and others, to identify 3D functional-group binding patterns (FragMaps) on the CMY-10 protein target. SILCS-Pharm (21, 22) was then used to extract important binding patterns from the FragMaps and turn them into pharmacophore features at the R1 and R2 subsites of CMY-10. Pharmacophore models were then constructed and used to initiate virtual screening (VS) against over 750,000 commercially available compounds. The top 10,000 hit compounds from the initial pharmacophore screen were selected for SILCS-Monte Carlo (SILCS-MC) sampling for further binding pose refinement and estimation of the binding affinity based on the ligand grid free energy (LGFE) evaluation. Fingerprint-based similarity clustering was then conducted to maximize the chemical diversity of the ranked compounds to be selected for bioassay testing. Several compounds leading to decreased β-lactamase activity were confirmed by bioassay tests. The best hit compound was then subjected to a similarity search for chemically similar analogs, and more inhibitory compounds were identified. The non-β-lactam-based β-lactamase inhibitors identified in this way have the potential to be used in combination therapy with β-lactam-based antibiotics against multi-drug resistant clinical isolates.
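The diversity-selection step in this workflow can be illustrated with a minimal sketch. It assumes that fingerprints are represented as Python sets of "on" bit indices, that similarity is measured with the Tanimoto coefficient, and that compounds are picked greedily from a score-ranked list. The function names, the 0.6 cutoff and the toy fingerprints are hypothetical, and the clustering actually used in the CMY-10 study (19) may differ.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def pick_diverse(ranked, cutoff=0.6):
    """Walk a score-ranked list of (name, fingerprint) pairs and keep a compound
    only if it is less similar than `cutoff` to every compound already kept."""
    kept = []
    for name, fp in ranked:
        if all(tanimoto(fp, kept_fp) < cutoff for _, kept_fp in kept):
            kept.append((name, fp))
    return [name for name, _ in kept]


# Toy data (hypothetical): compounds already sorted from best to worst score.
ranked_hits = [
    ("cmpd_A", {1, 2, 3, 4}),
    ("cmpd_B", {1, 2, 3, 4, 5}),  # close analog of cmpd_A (Tanimoto 0.8)
    ("cmpd_C", {7, 8, 9}),        # unrelated chemotype (Tanimoto 0.0)
]
selected = pick_diverse(ranked_hits)
```

With these toy inputs the close analog is skipped and one representative per chemotype is retained, which is the intent of maximizing chemical diversity before purchasing compounds for assay.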

Figure 1. SILCS-oriented CADD workflow developed in our laboratory and used in the CMY-10 project (19). Wet-lab and CADD techniques are colored in red and blue, respectively. Boxes with solid lines indicate methods used in the CMY-10 study, while boxes with dashed lines mark methods not used in the CMY-10 study but used in other studies. Double arrows indicate that the two techniques can be used interactively over several iterative drug design rounds.

With the fast development of more powerful computing hardware, expensive algorithms such as free energy perturbation methods (23), which can only be used to fine-tune drug candidates at the lead optimization stage, have become much more affordable and are now routinely used in a range of applications (24–26). Alternative CADD methods that represent novel solutions for exploiting the interactions between drugs and targets are also seeing wider use. Our laboratory put forward the SILCS methodology as described previously, and information from SILCS can be utilized in many different ways in various aspects of drug discovery (16–18). Another significant advancement is the development of machine learning (ML), especially deep learning (DL), based CADD algorithms (27) owing, in part, to the development of artificial intelligence (AI) methods in other areas (28).

ML algorithms are not new to the CADD area, but the increasing need for AI in areas such as image recognition and text processing has promoted powerful novel ML algorithms that can handle vast amounts of data (29, 30). The refined graphics processing unit (GPU) architecture (31) and its growing computing power have further accelerated the application of ML, and its adoption in CADD has surged in recent years. This includes a considerable number of antibiotic drug development studies employing ML (32, 33). For example, Palsson et al. developed an ML workflow for identifying genetic features driving antibiotic resistance (34). ML models were trained against the resistance profiles of 14 antibiotics across three urgent pathogens using genome sequences as inputs. The ML workflow was verified to generate models capable not only of predicting resistance profiles but also of identifying the responsible genes. In another study (35), Collins et al. conducted an antibiotic activity assay screen of nearly 2,300 chemically diverse FDA-approved and natural product compounds targeting E. coli. Deep neural network-based DL models were then trained to predict the inhibition probabilities from the chemical structures and properties of the tested compounds alone. The resulting DL model was used to screen the Drug Repurposing Hub database (36), and a known c-Jun N-terminal kinase inhibitor, SU3327, was predicted to be an antibiotic targeting E. coli. This molecule is structurally divergent from conventional antibiotics and was confirmed to display bactericidal activity against a wide phylogenetic spectrum of pathogens, demonstrating how ML can guide antibiotic discovery.

In the rest of this chapter, which serves as an update to our first edition, recent progress in our laboratory toward the development of novel SILCS-based CADD methods will be reviewed. Typical ML methods will also be covered. Readers are strongly encouraged to refer to the first edition of this chapter (4) for basic CADD concepts and classical protocols to gain a fundamental understanding of CADD methods for antibiotic development.

2. Materials

As in other computational sciences, the two basic materials in CADD are the hardware and software suited to the study of interest. The hardware, which refers to the computational resources, can be established locally, for example by purchasing and installing computer clusters in the workplace, or obtained on demand, e.g., as computing time requested from public supercomputer resources such as XSEDE (37) or purchased from private companies, such as cloud computing on the Amazon Web Services (AWS) or Microsoft Azure cloud platforms (38). Software requirements, on the other hand, vary depending on the specific study goals. In the first edition of this book, we introduced fundamental tools for CADD. Here, an update is provided with an emphasis on the common CADD tools used in our laboratory.

MD simulation packages such as CHARMM (39), GROMACS (40), NAMD (41) and OpenMM (42), among others, are continually being optimized. Better computational performance is reached through algorithm refinement and software engineering as well as optimized computing using GPUs (41–44). New MD programs developed in the GPU era are also emerging and attracting more attention. For example, ACEMD (45), which was optimized for use on Nvidia GPUs, maximizes its performance by running the full computation on GPUs rather than dividing the job between CPUs and GPUs.

Target structures are required for SBDD methods and can be downloaded from the Protein Data Bank (PDB) (46) if they have been solved by X-ray crystallography, nuclear magnetic resonance (NMR) or the recently matured cryogenic electron microscopy (cryo-EM) technique (47). For unsolved protein targets, 3D structures can be predicted using the recently released RoseTTAFold from Baker’s group (48) or the AlphaFold ML model from the Google DeepMind team, which was shown to have the best accuracy among protein structure prediction methods (49). For most proteins, structures predicted using the AlphaFold ML model can be downloaded from the server hosted by the European Bioinformatics Institute (https://alphafold.ebi.ac.uk/) (50, 51).

Force fields, which are used to estimate the energies and forces within and between molecules, continue to be refined. These include the CHARMM (52–55) and AMBER (56, 57) families, among others, which describe both macromolecules, for example with the CHARMM36 protein force field (52, 53), and small molecules, for example with the CHARMM General Force Field (CGenFF) (54, 55). To automate the creation of topologies and parameters for new molecules, programs such as the CGenFF program (see https://cgenff.paramchem.org) (58, 59) can be used. For experts who want to further optimize force field parameters, a standalone package named FFParam is available for CHARMM force field parametrization (60). In addition to the additive force fields that have a long history, emerging polarizable force fields such as the CHARMM Drude force field (61, 62) and AMOEBA (63) are now available; these treat electronic polarization effects explicitly, thereby describing the interactions between molecules more realistically. The increased computational cost introduced by polarization terms (~4-fold over the additive model with the Drude FF) is gradually being overcome by better computing algorithms and growing GPU capability (64, 65). Accordingly, it may be anticipated that polarizable FFs will see routine use in the near future to describe interactions between antibiotics and bacterial targets in CADD. As a further note, new types of force fields based on different perspectives are also emerging, such as the Open Force Field (66, 67), which is encoded by direct chemical perception instead of predefined atom types, and ML-driven models such as PhysNet (68, 69), which is based on deep neural networks, although their capabilities need to be thoroughly tested before broader use in CADD applications.

Virtual database screening (VS) is used to screen large chemical libraries in search of potential small molecule binders for a given macromolecular target. CADD methods such as docking (70) or pharmacophore modeling (71) can be adapted for this purpose. For docking, both free software, such as AutoDock Vina (72), and commercial software, such as GOLD (73), are available, among others (74). Open-source toolkits with interfaces to existing docking software are also available to facilitate a more integrated docking-based CADD cycle. For example, the Open Drug Discovery Toolkit (ODDT) (75), which is currently interfaced with AutoDock Vina, lets researchers conduct the full docking workflow, including in silico compound library preparation, library filtering, docking pose rescoring, docking performance evaluation and SAR model training. Another example is VirtualFlow (76), which has functions similar to ODDT but offers interfaces to additional docking programs and is built on an optimized architecture that enables efficient parallelization and balanced workloads for better docking performance against huge chemical libraries. Web-based docking platforms are also available and are convenient even for non-experts. For example, Webina (77) lets users run AutoDock Vina on the web without installation. Another web-based docking interface, SeamDock (78), allows users to select from four docking codes and also provides the ability to share visualizations of docking results with other researchers. For structure-based pharmacophore modeling, the open-source program Pharmer (79) can perform pharmacophore searching efficiently on large databases. A web interface to it, ZINCPharmer (80), provides an interactive environment for virtual screening of the ZINC or MolPort databases using pharmacophores, and the same research group later launched another web service called Pharmit (81) for online pharmacophore VS using user-tailored or a variety of pre-loaded databases. It should be noted that pharmacophore searching requires multiple conformations of each molecule in the database as well as assignment of the correct protonation and tautomeric states, with the latter requirement applying to all in silico databases used for either docking or pharmacophore screening.

VS uses chemical libraries to identify small molecules to be tested in biological assays for ligand discovery. While researchers can resort to in-house compounds based on their own chemical synthesis work, purchasing compounds from commercial chemical vendors is a convenient way to assist discovery at the early stage. ZINC (82) and MolPort (83) provide platforms for sourcing chemical compounds from various vendors. For de novo drug design, exploration of a larger chemical space generally holds the promise of higher success rates. The ultra-large REadily AccessibLe (REAL) (84) compound library from Enamine represents the largest purchasable chemical collection currently available. Its REAL Database currently contains 4.1 billion enumerated compounds and can be extended to over 20 billion compounds in the Enamine synthetically accessible database called REAL Space, for which compound synthesis time is ~8 weeks from order to delivery, with an average of over 80% of the requested compounds actually being synthesized and delivered. As an alternative to de novo drug development, drug repurposing is a lower-cost approach that explores existing therapeutics for a new disease indication (85, 86). For this purpose, the library of FDA-approved drugs can be downloaded from various sources, e.g., as a subset from ZINC (82). Another comprehensive library of clinical compounds, the Drug Repurposing Hub (36), which provides a hand-curated collection of thousands of approved and in-clinic compounds with annotated identities, is also available.

Integrated commercial software such as Discovery Studio (87) and MOE (88), among others, incorporates a broad range of CADD capabilities. On the open-source side, even though many choices are available for specific CADD needs, integrated code is rare. However, some commercial vendors, such as OpenEye (89), offer no-cost licenses for purely non-commercial research, while others provide discounted licensing for academic use. Similarly, the software suite from SilcsBio LLC (90), which offers end-to-end drug design capabilities in the context of the SILCS technology, is available at no cost to non-commercial research groups. Online platforms that offer integrated CADD capabilities without the hassle of local installation and without requiring advanced computer knowledge are also emerging. One example is PlayMolecule (91) from Acellera, which offers cloud-based CADD workflows covering target preparation, binding site identification, force field parametrization, MD simulation, docking and ML model generation. Most of these services are free to the public with some limitations, and full service is also available for purchase. Traditional CADD software companies are also transitioning to cloud services; for example, OpenEye (89) launched the Orion platform to offer a web-based environment for their software.

3. Methods

In addition to the common methods introduced in the first edition of this chapter, additional CADD methods developed recently in our laboratory as well as from other laboratories will be described below.

3.1. Protein structure prediction using AlphaFold

For SBDD, a protein 3D structure is required to explore atomic-level details of the ligand-protein interactions. When no protein structure is available from the PDB, structure prediction methods such as homology modelling (92) were traditionally used to generate 3D models. With the surge of AI and related DL techniques, DL-driven structure prediction methods such as RoseTTAFold (48) and AlphaFold (49) can now predict most protein 3D structures to a level approaching experimental accuracy. In the recent, challenging 14th Critical Assessment of protein Structure Prediction (CASP14), AlphaFold was demonstrated to greatly outperform other methods, and its predictions are competitive with experimental structures in a majority of cases. This promising progress makes the initiation of SBDD methods much more feasible. Below are general steps to prepare a protein structure using AlphaFold.

Go to the AlphaFold database hosted by the European Molecular Biology Laboratory European Bioinformatics Institute (EMBL-EBI) at https://alphafold.ebi.ac.uk and input the protein name, gene name or UniProt accession name in the search bar. The DeepMind team has already predicted the structures of most known human proteins as well as those of 20 model organisms (50, 51) including bacteria such as Escherichia coli and Staphylococcus aureus and deposited them to the EMBL-EBI server.

Click on the most relevant entry from the search hit list from the results page if the search is conducted using text other than the UniProt accession ID. On the next page, the predicted 3D structure is displayed with residues colored according to the predicted local distance difference test (pLDDT) metric and the predicted aligned error (PAE) matrix is also shown.

Check the prediction quality by looking at both the pLDDT metric, which is a per-residue metric that reflects the local confidence in the structure, and the PAE metric, which can be used to assess the confidence in the relative orientation of different parts of the model, e.g., domains.

The predicted structure is then downloaded in PDB or mmCIF format to the user’s machine for further analysis. For example, in some cases the predicted structure covers the full length of the sequence, but the user may want to focus only on a specific domain of the protein for drug design purposes. In such cases the downloaded structure can be trimmed for subsequent use. For regions with lower pLDDT values, MD simulation can further be conducted to equilibrate and refine the structure.

For proteins not yet deposited in the AlphaFold database, the 3D structure can be predicted using the AlphaFold code, which may be downloaded from the AlphaFold GitHub repository at https://github.com/deepmind/alphafold/. Follow the README file there to install the required environment and the AlphaFold program. Prepare a FASTA file with the sequence of the protein to be predicted and supply it to the python script to run the prediction. Following completion of the prediction, the output structures are saved in the subdirectory specified by the user via the ‘--output_dir’ flag of the python script. It should be noted that limitations exist in the true resolution of structures from AI-based prediction methods and that these methods do not account for the presence of alternate conformations of the protein (e.g., allosteric states); both points need to be considered when using 3D structures generated by these methods (93).
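The quality check and trimming steps above can be sketched in a few lines. AlphaFold model files in PDB format carry the per-residue pLDDT in the B-factor column, so the sketch below reads that column from ATOM records. The function names, the 70 cutoff and the two-atom example are hypothetical choices for illustration, and mmCIF files or multi-model files would need a proper structure parser instead.

```python
def atom_line(serial, name, resname, chain, resnum, x, y, z, plddt):
    """Build a fixed-column PDB ATOM record (helper used only for the toy example)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resnum:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.00:6.2f}{plddt:6.2f}")


def residue_plddt(pdb_text):
    """Return {(chain, residue_number): pLDDT} read from the B-factor column."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            chain = line[21]
            resnum = int(line[22:26])
            scores[(chain, resnum)] = float(line[60:66])
    return scores


def trim_low_confidence(pdb_text, cutoff=70.0):
    """Keep only ATOM records whose pLDDT is at or above `cutoff`."""
    kept = [
        line for line in pdb_text.splitlines()
        if line.startswith("ATOM") and float(line[60:66]) >= cutoff
    ]
    return "\n".join(kept)


# Toy two-residue "model": one confident residue and one low-confidence residue.
example = "\n".join([
    atom_line(1, "CA", "MET", "A", 1, 1.0, 2.0, 3.0, 92.5),
    atom_line(2, "CA", "ALA", "A", 2, 4.0, 5.0, 6.0, 45.1),
])
scores = residue_plddt(example)
trimmed = trim_low_confidence(example)
```

In practice, trimming by domain boundaries guided by the PAE matrix is often more appropriate than a flat pLDDT cutoff; the cutoff here only illustrates where the confidence values live in the file.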

3.2. MD simulations with polarizable force field

MD simulations model how atoms in a protein or other molecular system move over time based on a force field description (94), through integration of Newton’s equations of motion. These simulations can capture a wide range of important biomolecular processes such as conformational change and drug binding, where the dynamics of the system allow the entropic contributions to ligand binding to be taken into account, as required for calculating free energies. Accordingly, the rich information from MD simulations acts as the foundation for other CADD techniques such as FEP and the SILCS method developed in our laboratory. Large improvements in the speed, accuracy and accessibility of MD simulation software and environments have increased the utility of MD in CADD. Beyond the classic additive force fields (52–57) currently used in the majority of MD simulations, polarizable force fields that explicitly account for induced electronic polarization represent the next generation of physical models for MD simulations (95–101). Our laboratory studied the impact of electronic polarizability on protein-fragment interactions using the in-house developed classical Drude oscillator model, showing that the polarizable force field helps to improve the prediction of protein-ligand interactions and indicating the utility of a polarizable force field in CADD (102). The Drude oscillator FF models electronic polarization by attaching a charged particle to the nucleus of each non-hydrogen atom via a harmonic spring and allowing those particles to relax in the surrounding electric field with the nuclear positions fixed, as previously described. While the additional force field terms associated with the treatment of polarization increase the computational cost, improved algorithms and computational power make this class of simulations accessible (64, 65). For example, the computational overhead of the Drude FF over the additive CHARMM FF is approximately 4-fold (64, 65). Thus, the Drude FF as well as other polarizable FFs represent new tools for studying molecular systems that will make significant contributions to CADD.
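The way a charge on a spring produces polarization can be made concrete with a minimal sketch of the standard classical Drude relation for a single, isolated atom in a uniform field: the polarizability is α = q_D²/k_D, the relaxed displacement satisfies k_D·d = q_D·E, and the induced dipole is μ = q_D·d = α·E. The numbers below are hypothetical and expressed in reduced, self-consistent units; they are not Drude FF parameters.

```python
import math


def drude_charge(alpha, k_spring):
    """Magnitude of the Drude charge that reproduces polarizability alpha,
    using alpha = q**2 / k_spring."""
    return math.sqrt(alpha * k_spring)


def induced_dipole(alpha, k_spring, field):
    """Relax one Drude particle in a uniform field with its nucleus held fixed.

    Returns (induced dipole, Drude displacement)."""
    q = drude_charge(alpha, k_spring)
    displacement = q * field / k_spring  # force balance: k * d = q * E
    return q * displacement, displacement


# Hypothetical values in reduced units.
mu, d = induced_dipole(alpha=1.5, k_spring=1000.0, field=0.02)
```

The induced dipole equals α·E regardless of the spring constant chosen; a stiffer spring is simply paired with a larger Drude charge and a smaller displacement.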

In the first edition of this chapter, we introduced a standard MD simulation protocol. Here we present an MD simulation protocol using the Drude polarizable force field.

Obtain the protein structure from the PDB or predict the protein structure as described above in section 3.1. Prepare the protein structure for MD simulations by adding missing hydrogens, assigning appropriate protonation states to residues, and so on. These steps can be performed by a number of the publicly available and commercial modelling packages discussed above. Generate the CHARMM protein structure file (PSF) for the simulation system based on the CHARMM additive force fields using the web tool CHARMM-GUI (http://www.charmm-gui.org) (103) or locally by running the CHARMM code. CHARMM-GUI may be used for initial protein preparation as well as for preparation of the PSF and is available to non-commercial users.

Go to the CHARMM-GUI at http://www.charmm-gui.org and select Drude Prepper (104) and upload the additive PSF and coordinate files to construct the Drude force field based PSF and PDB files with added Drude particles and lone pairs. Also provided are the CHARMM input files for subsequent calculations listed below and the needed topology and parameter files. In addition, the user may request input files compatible with MD programs such as OpenMM and NAMD that support the Drude FF.

Similar to the protocol previously described for the additive FF (4), the system will go through minimization, equilibration and production steps. During minimization, Drude particles will be minimized first and then the entire system using the adopted-basis Newton-Raphson (ABNR) minimizer.

For equilibration and production runs, a hard-wall restraint of 0.2 Å on the distance between the parent atom and the Drude particle is applied to prevent instability and large displacements of Drude particles. The hard wall is designed to avoid the polarization catastrophe that may occur when infrequent close interactions between atoms during the MD simulation lead to over-polarization (105). During Drude simulations, the extended Lagrangian dynamics scheme (106) is used for integration of Newton’s equations of motion, where the real atoms and the Drude particles are coupled to a dual thermostat that controls their dynamics. The physical and Drude thermostats are maintained at different temperatures of 298 K and 1 K, with friction coefficients of 5 ps⁻¹ and 20 ps⁻¹, respectively. Drude simulations are typically propagated with a 1 fs time step.
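The run settings quoted above can be collected in one place, as in the following minimal sketch. The class and method names are hypothetical and do not correspond to the input syntax of CHARMM, OpenMM or NAMD; each program takes these values through its own keywords. The hard-wall check is only a diagnostic on a single parent/Drude pair, not an implementation of the hard-wall constraint itself.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DrudeMDSettings:
    """Dual-thermostat settings for a Drude simulation (values from the text)."""
    temperature_K: float = 298.0          # thermostat on real atoms
    drude_temperature_K: float = 1.0      # thermostat on Drude relative motion
    friction_per_ps: float = 5.0          # friction coefficient, real atoms
    drude_friction_per_ps: float = 20.0   # friction coefficient, Drude particles
    timestep_fs: float = 1.0
    hardwall_A: float = 0.2               # max parent-Drude separation

    def n_steps(self, length_ns):
        """Number of integration steps needed for a run of `length_ns`."""
        return round(length_ns * 1_000_000 / self.timestep_fs)

    def beyond_hardwall(self, parent_xyz, drude_xyz):
        """True if a Drude particle sits farther from its parent than the wall."""
        return math.dist(parent_xyz, drude_xyz) > self.hardwall_A


settings = DrudeMDSettings()
steps_for_10_ns = settings.n_steps(10.0)
```

Because the time step is 1 fs rather than the 2 fs common in additive simulations, a 10 ns production run requires ten million steps, which is part of the extra cost of Drude simulations.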

Analysis of Drude simulations, beyond that used for all MD simulations, can include variations in the dipole moments of various groups in the systems being studied. This allows for an understanding of how variations in the electronic structure of the system associated with the explicit inclusion of polarizability affect the properties of the system. When calculating dipole moments, care must be taken because the dipole depends on the choice of origin when the sum of the charges is not zero. To account for this and to facilitate dipole analysis with the Drude FF, the charges on all particles within a group (e.g., protein sidechains and nucleic acid bases) sum to an integer, though the spatial orientations of charged groups must still be considered.
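The origin dependence mentioned above follows directly from the definition of the dipole, μ = Σ qᵢ(rᵢ − r₀): moving the origin by s changes μ by −Q·s, where Q is the net charge. The following minimal sketch shows this with two point charges; the charges, coordinates and function name are hypothetical.

```python
def dipole(charges, coords, origin=(0.0, 0.0, 0.0)):
    """Dipole vector sum(q_i * (r_i - origin)) for a set of point charges."""
    return tuple(
        sum(q * (r[axis] - origin[axis]) for q, r in zip(charges, coords))
        for axis in range(3)
    )


coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
neutral = [-0.5, 0.5]   # net charge 0
charged = [0.0, 1.0]    # net charge +1
shift = (2.0, 0.0, 0.0)

# Neutral group: the dipole is the same for both origins.
neutral_same = dipole(neutral, coords) == dipole(neutral, coords, shift)

# Charged group: the x component changes by Q * shift_x = 1 * 2.
charged_delta = dipole(charged, coords)[0] - dipole(charged, coords, shift)[0]
```

For charged groups, a consistent reference point such as the center of geometry of the group is therefore needed before dipoles from different frames or different groups are compared; that choice is an assumption of the analysis rather than something prescribed here.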

3.3. Docking using SILCS-MC and ML based reweighting for SAR

Docking is a useful CADD tool for predicting the binding orientation of a ligand molecule within a target binding site as well as for evaluating its binding strength (70). Traditional docking methods consider the protein to be rigid or to have limited flexibility, and ignore the contribution of desolvation to binding or treat it in an empirical way. While FEP and molecular mechanics (MM) with Poisson–Boltzmann (PB) and surface area solvation (MM/PBSA) as well as MM/generalized-Born SA (MM/GBSA) methods (107) do account for desolvation, these are computationally demanding approaches, which limits their utility in CADD. A novel method designed to overcome this drawback is the SILCS-MC (19) docking method put forward by our laboratory. SILCS-MC conducts ligand sampling within the grid free energy (GFE) FragMaps from SILCS. This takes advantage of GCMC/MD simulations of the protein in aqueous solution with selected organic solutes to precompute the GFE FragMaps, which are free energy functional-group affinity patterns that encompass the entire protein and account for protein flexibility and desolvation contributions (108–110). SILCS-MC then simply assigns the GFE value of the appropriate FragMap type to each atom in the molecule and sums those values to obtain the LGFE score. MC conformational sampling is performed to allow the orientation and conformation of the ligand to relax in the field of the GFE FragMaps. This allows SILCS-MC docking to be performed in a highly computationally efficient fashion while achieving a level of accuracy similar to that of highly expensive FEP methods (109). The SILCS method was fully described in the first edition of this chapter (4). Below we present the SILCS-MC docking protocol, assuming the user has already run the SILCS simulations and obtained the GFE FragMaps. A Bayesian ML-based reweighting protocol is also described that improves the predictability of the SILCS method and can be applied when experimental data on a small set of ligands (10 or more) are available (108).
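The LGFE scoring idea, assigning each atom the GFE of the matching FragMap at its position and summing, can be sketched as follows. This is a simplified illustration and not the SILCS-MC implementation: FragMaps are represented here as hypothetical Python dictionaries from voxel index to GFE, the grid spacing and GFE values are made up, and atoms that fall outside a map or lack a FragMap type simply contribute zero.

```python
def voxel_of(xyz, spacing=1.0):
    """Index of the grid voxel that contains a Cartesian point."""
    return tuple(int(c // spacing) for c in xyz)


def lgfe(atoms, fragmaps, spacing=1.0):
    """Sum per-atom GFE values over a ligand pose.

    atoms    : list of (fragmap_type, (x, y, z)) for classified ligand atoms
    fragmaps : {fragmap_type: {voxel_index: GFE}}
    """
    total = 0.0
    for map_type, xyz in atoms:
        grid = fragmaps.get(map_type, {})
        total += grid.get(voxel_of(xyz, spacing), 0.0)
    return total


# Hypothetical maps and pose; GFE values are illustrative only.
fragmaps = {
    "BENC": {(0, 0, 0): -1.2, (1, 0, 0): -0.4},
    "PRPC": {(2, 0, 0): -0.8},
}
pose = [
    ("BENC", (0.3, 0.2, 0.9)),
    ("BENC", (1.5, 0.1, 0.4)),
    ("PRPC", (2.2, 0.7, 0.1)),
]
score = lgfe(pose, fragmaps)
```

Because scoring a pose reduces to grid lookups and a sum, many poses can be evaluated per ligand at low cost, which is what makes MC sampling in the field of the FragMaps computationally efficient.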

Prepare molecule coordinate files for the ligands to be docked in either mol2 or sdf format. Each mol2 file contains a single molecule, whereas the current SILCS-MC code accepts multiple molecule entries in a single sdf file.

Select an atom classification scheme (ACS) for the SILCS-MC run. During SILCS-MC, the GFE is evaluated for each atom in the molecule by overlapping the atom with the SILCS FragMap of its assigned type. The ACS controls the assignment of a FragMap type to each atom in a molecule, based on its CGenFF atom type and chemical connectivity, during the initiation of a SILCS-MC run. Typical ACS include generic and specific types. The generic ACS uses more general FragMap types, e.g., both aromatic and aliphatic carbon atoms in a molecule are assigned the generic nonpolar GENN FragMap type. The specific ACS assigns specific FragMap types to specific atoms, e.g., an aromatic carbon atom gets the BENC (benzene carbons) FragMap type while an aliphatic carbon gets the PRPC (propane carbons) type. The generic ACS is the default.

Choose an MC sampling protocol for the SILCS-MC run. Ligand binding poses are sampled using Metropolis MC sampling followed by simulated annealing (SA) during a SILCS-MC run. Thus, MC/SA parameters such as the number of simulation cycles (n_CY) and steps (n_MC/n_SA), as well as the step ranges of the global rotational (dθ), translational (dX) and intramolecular dihedral (dφ) degrees of freedom, can be adjusted depending on the specific system. Typical protocols include local and exhaustive types, although users can customize their own protocol by changing the corresponding parameters in the SILCS-MC input file. The local protocol is designed for pose refinement; sampling starts from the user-supplied pose with limited conformational sampling using n_CY=10, n_MC=100, dX=0.5 Å, dθ=15°, dφ=45° and n_SA=1000. The exhaustive protocol is designed for full docking of the ligand orientation and conformation in a given pocket to determine its most favorable orientation when no initial binding information is available from experiment. It starts with a randomized orientation of the ligand within a sphere with a user-defined center and radius. MC sampling is performed to allow for larger conformational changes using n_CY=250, n_MC=10,000, dX=1 Å, dθ=180°, dφ=180° and n_SA=40,000. For both local and exhaustive sampling, the SA steps that follow MC in each cycle use dX=0.2 Å, dθ=9° and dφ=9°.
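The two protocols and the Metropolis MC plus SA cycle can be sketched as below. The parameter values come from the text; everything else is an assumption for illustration: the class and function names are hypothetical, the temperature (kT near 298 K) and the linear cooling schedule are assumed, and the sampler moves a single 1D coordinate on a toy score function rather than a ligand in FragMaps.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MCProtocol:
    n_cy: int      # MC/SA cycles
    n_mc: int      # MC steps per cycle
    n_sa: int      # SA steps per cycle
    dx: float      # max translation step (Angstrom)
    dtheta: float  # max rigid-body rotation step (degrees)
    dphi: float    # max dihedral rotation step (degrees)


# Parameter values quoted in the text.
LOCAL = MCProtocol(10, 100, 1000, 0.5, 15.0, 45.0)
EXHAUSTIVE = MCProtocol(250, 10_000, 40_000, 1.0, 180.0, 180.0)
SA_STEPS = {"dx": 0.2, "dtheta": 9.0, "dphi": 9.0}

KT = 0.593  # kcal/mol near 298 K (assumed temperature)


def metropolis_accept(delta_e, kt, rng):
    """Always accept downhill moves; accept uphill with probability exp(-dE/kT)."""
    return delta_e <= 0.0 or rng.random() < math.exp(-delta_e / kt)


def run_cycle(score, x0, proto, rng, kt=KT):
    """One MC phase followed by one SA phase on a 1D toy coordinate."""
    x, e = x0, score(x0)
    best_x, best_e = x, e
    for phase_steps, max_step, cooling in ((proto.n_mc, proto.dx, False),
                                           (proto.n_sa, SA_STEPS["dx"], True)):
        for step in range(phase_steps):
            # Linear cooling during SA is an assumed schedule.
            kt_now = kt * (1.0 - step / phase_steps) + 1e-6 if cooling else kt
            trial = x + rng.uniform(-max_step, max_step)
            e_trial = score(trial)
            if metropolis_accept(e_trial - e, kt_now, rng):
                x, e = trial, e_trial
                if e < best_e:
                    best_x, best_e = x, e
    return best_x, best_e


# Toy usage: refine a coordinate on a quadratic "score" with its minimum at 2.0.
rng = random.Random(1)
best_x, best_e = run_cycle(lambda x: (x - 2.0) ** 2, 0.0, LOCAL, rng)
```

A real docking move set would also perturb the rigid-body rotation and dihedral angles using dθ and dφ; those are carried in the protocol object here but not exercised by the 1D toy.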

Run the SILCS-MC simulations using the SILCS-MC code with the ACS file, the CGenFF rules and parameter files, the GFE FragMap files, the exclusion map file and the user-defined parameters. CGenFF parameters are initially assigned to the ligand by the CGenFF engine; they allow for energy minimization of the ligand during initiation of the SILCS-MC simulation and are used to calculate the intramolecular energy during the MC calculation. The exclusion map represents the forbidden region of the protein that was not sampled by the non-hydrogen atoms of the solutes or water during the SILCS simulation and is used as a penalty score to guide the sampling. Usually, five independent SILCS-MC runs are conducted in parallel to expedite convergence of the docking results. In each run, after a minimum of 50 MC/SA cycles, if the lowest three LGFE scores are within 0.5 kcal/mol, the run is considered converged and is terminated. Otherwise, cycles continue until either the convergence criterion is met or the user-defined maximum number of MC/SA cycles, 250 by default, has been reached.
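The stopping rule for a single run can be written compactly. The sketch below assumes that "within 0.5 kcal/mol" means the spread between the lowest and third-lowest LGFE scores collected so far; the function names are hypothetical and the per-cycle scores are supplied by the caller.

```python
def is_converged(lgfe_per_cycle, min_cycles=50, n_lowest=3, tol=0.5):
    """True once min_cycles are done and the n_lowest best scores span <= tol."""
    if len(lgfe_per_cycle) < min_cycles:
        return False
    lowest = sorted(lgfe_per_cycle)[:n_lowest]
    return len(lowest) == n_lowest and (lowest[-1] - lowest[0]) <= tol


def run_until_converged(cycle_scores, max_cycles=250):
    """Consume one LGFE per MC/SA cycle until converged or max_cycles is hit."""
    history = []
    for score in cycle_scores:
        history.append(score)
        if is_converged(history) or len(history) >= max_cycles:
            break
    return history


# Toy usage: 49 poor cycles, then three similar low-scoring cycles.
scores = [-5.0] * 49 + [-7.0, -7.2, -7.4, -7.5]
history = run_until_converged(scores)
```

In this toy series the run stops at cycle 52, the first point at which three scores within 0.5 kcal/mol of each other are the lowest seen.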

After the SILCS-MC simulations are finished, the docking pose with the lowest LGFE score can be extracted and used as the predicted binding orientation for the ligand. The docking pose can be visualized together with the protein structure and FragMaps and analyzed. One advantage of SILCS-MC over traditional docking methods is the ease with which the total docking score can be decomposed into atomic contributions, which is especially useful during the ligand optimization step (111, 112). Atomic GFE values can be visually checked for a ligand to identify favorable and unfavorable functional groups. For example, when modifying a ligand, favorable gains associated with the modification may be offset by a loss of favorable contributions in another part of the molecule, information that is not readily accessible with other CADD docking methods (109). In addition, visualization of the FragMaps around the docking pose of a ligand can offer ideas about additional functional groups that may be introduced to the current scaffold to further improve affinity.

SILCS Bayesian ML reweighting: When experimental binding data are available, LGFE scoring can be trained using ML to yield refined predictions and a more accurate SAR model. The LGFE is a simple summation over all atomic GFE contributions from the different FragMap types, which assumes that the contributions from the FragMap types are well balanced when fragments are combined into a full molecule. In practice this is an approximation, since the sum of the binding affinities of the individual fragments in a molecule does not formally equal the binding affinity of the full molecule because of the energetic changes associated with linking the fragments. Accordingly, the GFE FragMap contributions to the LGFE can be reweighted based on experimental binding data to improve the predictive ability of SILCS-MC. This is done using a Bayesian Markov-Chain Monte Carlo based ML (BML) method (108).

To start the BML reweighting process, experimental data and SILCS-MC docking poses in PDB format are required. The ML training can be conducted by optimizing the root-mean-square deviation (RMSD), Pearson correlation or percent correct (true positives and true negatives) metric between the LGFE scores and the experimental binding free energies. The user can also select from three restraint types (flat-bottom, hard wall and harmonic) to limit overfitting. Running the BML code yields trained weighting factors for each FragMap type and an estimate of the improvement in prediction based on the current docking poses. The optimized FragMap weighting factors are then used to redo the SILCS-MC run to verify the actual improvement. The new LGFE scoring formula with the trained weighting factors can then be used for new ligand designs for the current protein target. In cases where the weighting factors are overfitted, the docking poses from the second SILCS-MC run are highly perturbed and the LGFE scores are in poorer agreement with experiment, providing a check on the applied BML fitting parameters.
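To make the reweighting idea concrete, the sketch below fits one weight per FragMap type by Metropolis sampling of an RMSD objective with a flat-bottom restraint on the weights. It is a simplified stand-in and not the BML code: the function names, the restraint bounds and force constant, the step size, the sampling temperature and the FragMap type labels are all assumptions, and it scores fixed per-type GFE sums rather than re-docking.

```python
import math
import random


def weighted_lgfe(contribs, weights):
    """contribs: {fragmap_type: summed atomic GFE of that type} for one ligand."""
    return sum(weights[t] * g for t, g in contribs.items())


def rmsd(pred, expt):
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, expt)) / len(expt))


def flat_bottom_penalty(weights, lo=0.5, hi=1.5, k=10.0):
    """Zero inside [lo, hi], harmonic outside (bounds and k are assumed values)."""
    pen = 0.0
    for w in weights.values():
        if w < lo:
            pen += k * (lo - w) ** 2
        elif w > hi:
            pen += k * (w - hi) ** 2
    return pen


def fit_weights(ligands, expt, n_steps=5000, step=0.05, temp=0.05, seed=0):
    """Metropolis search over per-type weights, starting from all weights = 1."""
    rng = random.Random(seed)
    types = sorted({t for lig in ligands for t in lig})
    weights = {t: 1.0 for t in types}

    def objective(ws):
        pred = [weighted_lgfe(lig, ws) for lig in ligands]
        return rmsd(pred, expt) + flat_bottom_penalty(ws)

    current = objective(weights)
    best_w, best = dict(weights), current
    for _ in range(n_steps):
        trial = dict(weights)
        trial[rng.choice(types)] += rng.uniform(-step, step)
        value = objective(trial)
        if value <= current or rng.random() < math.exp(-(value - current) / temp):
            weights, current = trial, value
            if current < best:
                best_w, best = dict(weights), current
    return best_w, best


# Synthetic example: 10 ligands whose "experimental" values follow known weights.
data_rng = random.Random(7)
true_w = {"apolar": 1.3, "donor": 0.7}
ligands = [{"apolar": -data_rng.uniform(2.0, 6.0),
            "donor": -data_rng.uniform(0.5, 3.0)} for _ in range(10)]
expt = [weighted_lgfe(lig, true_w) for lig in ligands]
fitted, objective_value = fit_weights(ligands, expt)
```

As the text notes, weights fitted on fixed poses only estimate the improvement; the check against overfitting is to rerun SILCS-MC with the new weights and compare poses and scores again.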

3.4. Site identification using SILCS-Hotspots

Computational binding site identification methods can be used to identify novel, druggable sites on new protein targets for potential therapeutic development (113). For antibiotic development, such methods can be employed to search for putative allosteric sites as alternatives to the active or orthosteric sites on bacterial proteins to overcome drug resistance issues (114). A binding-site identification method under the SILCS framework, named SILCS-Hotspots, was developed recently by our laboratory (115). SILCS-Hotspots is designed to identify fragment binding hotspots that are spatially distributed across the entire protein structure, including both surface and interior binding sites. The general protocol for using SILCS-Hotspots to identify putative binding sites on a protein is described in the following steps. The protocol requires that the SILCS FragMaps are already available.

Select a collection of representative molecular fragments to be used for the hotspot search. The Astex MiniFrag set (116) and the collection of ~90 mono- and bicyclic rings present in drug molecules (117) are both good fragment libraries for this purpose.

Partition the protein system into a set of overlapping 14.14 Å³ sub-spaces that encompass the entire protein. For each individual sampling box, exhaustive SILCS-MC as described in section 3.3 is conducted for every fragment in the library. All SILCS-MC docking poses that are sampled over the full space are collected for each fragment. Fragment docking poses with LGFE scores of −2 kcal/mol or more favorable that lie within 6 Å of any protein Cα atom are selected as relevant binding poses and subjected to the following clustering steps.

For each fragment, a center-of-mass (COM) based clustering with a 3 Å cluster radius is performed. The clustering determines the number of neighbors within a 3 Å radius of the COM of each docking pose and then identifies the pose with the largest number of neighbors. That pose and its cluster members are then removed from the pool of docking poses, and the process is continued until no additional poses remain. This step selects representative docking poses for each fragment.

A second round of clustering is conducted over the docking poses of all fragments obtained from the first round of clustering. The same clustering algorithm is used but with a radius of 4 Å, from which clusters that contain representative docking poses for one or more fragments are identified. These cluster centers are defined as Hotspots. Information on the hotspots includes the number and types of fragments in the site, the LGFE scores of all fragments, and their spatial relationships. Hotspot ranking may then be performed based on the average LGFE scores or the number of fragments in a site.
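The pose filter and the greedy COM-based clustering used in both rounds can be sketched as follows. The cutoffs are those given in the text; the function names and the data layout (poses as COM and LGFE pairs) are assumptions, and ties between equally populated poses are broken by input order, which the text does not specify.

```python
import math


def filter_poses(poses, ca_coords, lgfe_cut=-2.0, ca_cut=6.0):
    """Keep (com, lgfe) poses with lgfe <= lgfe_cut and a C-alpha within ca_cut."""
    return [(com, score) for com, score in poses
            if score <= lgfe_cut
            and any(math.dist(com, ca) <= ca_cut for ca in ca_coords)]


def greedy_com_clusters(coms, radius):
    """Repeatedly pick the pose with most neighbors, then remove its cluster.

    Returns a list of (center_index, member_indices) tuples.
    """
    remaining = list(range(len(coms)))
    clusters = []
    while remaining:
        neighbors = {i: [j for j in remaining
                         if math.dist(coms[i], coms[j]) <= radius]
                     for i in remaining}
        center = max(remaining, key=lambda i: len(neighbors[i]))
        members = neighbors[center]
        clusters.append((center, members))
        removed = set(members)
        remaining = [i for i in remaining if i not in removed]
    return clusters


# Toy usage with made-up coordinates (Angstrom) and LGFE scores (kcal/mol).
ca_atoms = [(0.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
poses = [((0.0, 0.0, 0.0), -3.1), ((1.0, 0.0, 0.0), -2.5), ((2.0, 0.0, 0.0), -2.2),
         ((10.0, 0.0, 0.0), -2.8), ((11.0, 0.0, 0.0), -1.0), ((40.0, 0.0, 0.0), -4.0)]
kept = filter_poses(poses, ca_atoms)
round_one = greedy_com_clusters([com for com, _ in kept], radius=3.0)
```

For the second round, the representative poses of all fragments would be pooled and passed through `greedy_com_clusters` again with `radius=4.0`.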

As previously discussed, top-ranking sites based on quantitative criteria do not always correspond to known binding sites of drug-like molecules, including allosteric modulators (115). Instead, visual inspection of the hotspots is undertaken to identify sites in which two or more hotspots are adjacent to each other, as required to covalently link fragments occupying each hotspot to create drug-like molecules. This qualitative selection of sites is facilitated by analysis of the SILCS FragMaps to identify sites with apolar characteristics indicative of hydrophobic forces driving binding, along with FragMaps representative of polar groups. In addition, analysis of the SILCS exclusion map allows for identification of regions of the protein that open between Hotspots, which allows fragments in those sites to be chemically linked in ways that are not evident from analysis of the solvent accessible surface. Figure 2 shows an example of representative sites from this qualitative selection process for the bacterial enzyme TEM-1 beta-lactamase (118, 119). It is evident that the sites contain favorable hotspots, including the presence of apolar FragMaps (green), as such interactions are important to drive ligand affinity, along with multiple polar FragMaps such as H-bond donor and acceptor types that can contribute to specificity (e.g., sites 1–3 in Figure 2). All the qualitatively selected sites can be further evaluated quantitatively.

Figure 2. SILCS-Hotspots analysis for the bacterial enzyme TEM-1 beta-lactamase using the Astex MiniFrag set. SILCS apolar (green), H-bond donor (blue) and H-bond acceptor (red) FragMaps are rendered at −1.0 kcal/mol, while positively charged (cyan) and negatively charged (orange) FragMaps are rendered at −1.2 kcal/mol. Hotspots are shown as spheres with the average LGFE colored on a red-white-blue (more to less favorable) scale. Putative binding sites selected based on adjacent hotspots, FragMaps and exclusion maps are shown in red dashed circles. Crystal binding modes of an active site binder (PDB:1ERM) (118) and an allosteric site binder (PDB:1PZP) (119) are shown. rSASA% vs LGFE plots are shown in the lower panel for the top 25 LGFE-ranked FDA compounds for all three sites, with average values indicated as vertical and horizontal lines.

Quantitative evaluation may be performed through exhaustive SILCS-MC docking on each selected site using a library of drug molecules, for example, FDA-approved drugs. In house, a set of ~380 chemically diverse FDA-approved compounds was assembled for this purpose. Exhaustive SILCS-MC docking is performed with the center of each site defined based on the central hotspot along with a 5 Å radius; the process may be repeated with each hotspot within a site of interest as the center of the docking region. For each site, the average LGFE score of the 25 top-ranked FDA compounds based on the LGFE scores is obtained along with the percent relative solvent accessible surface area (rSASA%) (120). The rSASA% is calculated using the solvent accessibility of each ligand in the presence and absence of the protein. Free software such as FreeSASA can be used for such a calculation (121). The combination of these metrics is then used to quantify the binding sites with ideal sites giving highly favorable LGFE scores (

3.5. Membrane permeation prediction using SILCS

Most antibiotics are designed to target proteins involved in intracellular processes; thus the bacterial outer membrane must be penetrated for these antibiotics to function. Drug resistance involving modifications of macromolecules in the outer membrane is a common issue that needs to be considered when searching for new antibiotics (122, 123). While bacterial membranes are complex environments with multiple transport and pore proteins, it is useful to estimate the pure membrane permeability of drug candidates during drug discovery as this may contribute to drug bioavailability. Traditionally, potential of mean force (PMF) free-energy profiles for a compound across membrane lipid bilayers are derived using MD simulations (124). The PMF may then be used together with the position-specific diffusion coefficient in the inhomogeneous solubility-diffusion equation (125) to derive the effective resistivity, which may be inverted to give the permeability. Under the SILCS framework, we recently put forward a protocol that uses the LGFE profile to calculate a permeation-related resistant factor for a molecule crossing a membrane (126); it is described in the following steps.

Set up the membrane lipid bilayer system. This can be a bilayer system with a lipopolysaccharide composition that is specific for the bacterial outer membrane of interest (127), or a simple bilayer model. Examples include pure 1,2-dipalmitoyl-sn-glycero-3-phosphocholine (DPPC), a (0.9:0.1) 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC)/cholesterol mixture or a (0.52:0.18:0.3) 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC)/1,2-dioleoyl-sn-glycero-3-phospho-l-serine (DOPS)/cholesterol composition that mimics the lipid mixture used in a parallel artificial membrane permeability assay (PAMPA) experimental study (128). The membrane builder functionality (129) in CHARMM-GUI is a very convenient tool to set up such lipid bilayer systems. Minimization and a short MD simulation can be conducted to further stabilize the lipid bilayer model using the protocol and inputs supplied by CHARMM-GUI.

Perform the standard SILCS simulation on the lipid bilayer system and generate the GFE FragMaps.

Calculate the LGFE profile for drug-like ligands across the lipid bilayer. Run SILCS-MC for the ligand along the bilayer normal Z in 1 Å increments covering the full bilayer system, including both the lipid and water phases. At each Z position, SILCS-MC is performed in exhaustive mode as described in section 3.3, except that the ligand COM is only allowed to vary by a maximum of 1 Å from the assigned Z value during MC sampling. The SILCS-MC simulation can be conducted at multiple different (X,Y) positions in the plane of the bilayer to ensure adequate sampling. An LGFE profile along Z is constructed at each (X,Y) position, and the profiles from the different (X,Y) positions are averaged to give the final LGFE profile along with its standard deviation.

Use the LGFE profile G(z) along the Z axis to calculate the permeation-related resistant factor R. The effective membrane permeability $P_{\mathrm{eff}}$ can be calculated from the effective resistivity $R_{\mathrm{eff}}$ through the equation $1/P_{\mathrm{eff}} = R_{\mathrm{eff}} = \int_{h} R(z)\,dz$, where h is the bilayer thickness and the resistivity R(z) at position z is defined as $R(z) = e^{\beta \Delta G(z)}/D(z)$, where D(z) is the position-specific diffusion coefficient at position z and $\Delta G(z) = G(z) - G_{\mathrm{ref}}$. Here, G(z) is the LGFE profile as a function of z and $G_{\mathrm{ref}}$ is the reference free energy in the water phase, which can be calculated as the average LGFE over the water phase. $\beta = 1/k_{B}T$, where $k_{B}$ is the Boltzmann constant and T is the absolute temperature. In the current protocol, D(z) is not calculated and is assumed to be a constant D, so that $1/P_{\mathrm{eff}} = R/D$, where $R = \int_{h} e^{\beta \Delta G(z)}\,dz$ is calculated as the resistant factor.
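The averaging over (X,Y) positions and the integral for R can be evaluated numerically as sketched below. The sketch assumes trapezoidal integration over the supplied z range (which the caller should choose to span the bilayer), a temperature of 298.15 K, and energies in kcal/mol; the function names and the way the water-phase points are specified are hypothetical.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def average_profiles(profiles):
    """Mean and population standard deviation of the LGFE at each z position."""
    n = len(profiles)
    means, stds = [], []
    for column in zip(*profiles):
        m = sum(column) / n
        means.append(m)
        stds.append(math.sqrt(sum((v - m) ** 2 for v in column) / n))
    return means, stds


def resistant_factor(z, g, water_idx, temperature=298.15):
    """R = integral of exp(beta * (G(z) - G_ref)) dz by the trapezoidal rule.

    water_idx lists the indices of z points taken to be in the water phase;
    G_ref is the mean LGFE over those points.
    """
    beta = 1.0 / (KB * temperature)
    g_ref = sum(g[i] for i in water_idx) / len(water_idx)
    integrand = [math.exp(beta * (gi - g_ref)) for gi in g]
    return sum(0.5 * (integrand[i] + integrand[i + 1]) * (z[i + 1] - z[i])
               for i in range(len(z) - 1))


# Toy usage: a flat profile and a profile with a 2 kcal/mol central barrier.
z = [float(i) for i in range(11)]                  # 1 Angstrom increments
flat = [-3.0] * 11
barrier = [-3.0] * 4 + [-1.0] * 3 + [-3.0] * 4
water = [0, 1, 9, 10]
r_flat = resistant_factor(z, flat, water)
r_barrier = resistant_factor(z, barrier, water)
```

Because D is treated as a constant, R values are most useful for ranking compounds against each other on the same bilayer model rather than as absolute permeabilities.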

3.6. Protein-protein interaction prediction using SILCS-PPI

Protein-protein interactions (PPIs) are involved in a large number of vital cellular processes in bacteria and can serve as novel antibiotic targets (130, 131). Efforts towards the inhibition of PPIs related to division and replication, transcription, outer membrane protein complexes, as well as toxin-antitoxin systems in bacteria are ongoing (132). Thus, PPI prediction is useful for identifying novel druggable PPI interfaces in bacterial proteins and can pave the way toward novel antibiotic therapeutics. Traditional PPI prediction methods are mostly based on rigid protein structures with limited consideration of flexibility (133). Using the GFE FragMaps and the protein residue occupancy distributions, or protein probability grids (PPG), calculated from SILCS, a PPI prediction method named SILCS-PPI was put forward in our laboratory (134). It uses the SILCS FragMaps and PPGs from both proteins involved in a PPI, which intrinsically account for protein flexibility, together with fast Fourier transform (FFT) enhanced sampling to sample a comprehensive set of PPI orientations. These orientations are then ranked based on the overlap of the FragMaps and PPGs of the protein partners. The general SILCS-PPI protocol is described in the following steps and requires that the SILCS FragMaps and PPGs for both proteins are already available.

Run the SILCS-PPI prediction using the FragMaps, PPGs and exclusion maps from both proteins. During the run, the FragMaps from the ligand protein are rotated and translated to match the PPGs from the receptor protein and vice versa. To expedite the process, unique rigid-body rotations (135) are considered for the ligand protein, and for each rotation FFT is used to calculate the PPI scores for all global translations in one go. The final SILCS-PPI score is obtained by summing all ligand FragMap-receptor PPG scores and receptor FragMap-ligand PPG scores of all types, as well as an exclusion score calculated from the correlation of the exclusion maps from the two proteins, which serves as an alternative shape complementarity score.
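The reason FFT gives the scores for all translations at once is the correlation theorem. The 1D periodic toy below illustrates this with a small pure-Python FFT; it is not the SILCS-PPI implementation, which works on 3D grids and sums several map-type pairs plus an exclusion term. The function names, the grid values and the convention that a larger overlap is better are assumptions for the illustration.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def ifft(spec):
    """Inverse FFT via conjugation of the forward transform."""
    n = len(spec)
    return [v.conjugate() / n for v in fft([s.conjugate() for s in spec])]


def scores_direct(ppg, fmap):
    """score[t] = sum_u fmap[u] * ppg[(u + t) % n]: ligand map shifted by t."""
    n = len(ppg)
    return [sum(fmap[u] * ppg[(u + t) % n] for u in range(n)) for t in range(n)]


def scores_fft(ppg, fmap):
    """Same scores for every shift t from one forward/inverse FFT pair."""
    spec = [a.conjugate() * b for a, b in zip(fft(fmap), fft(ppg))]
    return [v.real for v in ifft(spec)]


# Toy 1D grids: the receptor PPG has density at points 5 and 6, and the ligand
# FragMap has favorable weight at points 0 and 1 (made-up values).
receptor_ppg = [0, 0, 0, 0, 0, 1, 2, 0]
ligand_fragmap = [1, 2, 0, 0, 0, 0, 0, 0]
direct = scores_direct(receptor_ppg, ligand_fragmap)
fast = scores_fft(receptor_ppg, ligand_fragmap)
best_shift = max(range(len(fast)), key=lambda t: fast[t])
```

The direct loop costs on the order of N² operations per rotation whereas the FFT route costs on the order of N log N, which is what makes scanning every translation for many rotations tractable on 3D grids.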

Save the top-ranked solutions (global translation and rotation parameters) and construct the PPI complex coordinates. Two-pass clustering is then used to cluster all complex models. In the first pass, COM-based clustering is conducted to put all models whose COM distances are within 6 Å into the same cluster. In the second pass, orientation-based clustering is performed using an angular distance metric that preserves periodicity (136). The distance cutoff is set to 0.5, which corresponds to an angle of about 30°. After the two-pass clustering, the best-scoring pose from each cluster is saved for further evaluation.

The COMs of the top-ranked solutions can be visualized on the surfaces of both the ligand and receptor proteins and colored by SILCS-PPI score to help interpret the predicted PPI interfaces. The populations of COMs on the protein surface can be used to predict alternative PPI sites. In addition, the PDB coordinates of the PPI complexes may be accessed, though they are based on the rigid crystal structures used to initiate the SILCS-PPI calculation, such that there will typically be steric overlap between the two proteins that requires careful relaxation of the structures prior to MD simulations.

3.7. Biologics formulation using SILCS

Besides efforts to develop small molecule antibiotics to counteract the evolving drug resistance of bacteria, researchers are also applying biologics-based drugs such as monoclonal antibodies (mAbs) in the battle (137–140). Biomacromolecular therapeutics, or so-called biologics, need to be carefully formulated to maximize protein stability and minimize viscosity, so as to ensure both efficacy and safety for highly concentrated formulations (141). Toward maximizing stability, biologics can be formulated with excipients to help minimize aggregation and denaturation of the biologic in a solution formulation (142). To assist the rational selection of excipients for biologics, we developed the SILCS-Biologics protocol (143, 144), which combines SILCS-PPI and SILCS-Hotspots as described above to predict both the PPIs that can contribute to protein aggregation and increased viscosity and the binding sites of excipients. This information is then combined to build a model for the prediction of protein stability, aggregation and viscosity. The basic protocol is described in the following steps.

Run SILCS simulation on the biologic protein and generate both FragMaps and PPGs as described in section 3.6.

Predict PPI sites using SILCS-PPI as described in section 3.6. Instead of two proteins, here the SILCS-PPI calculation is conducted against the same set of FragMaps and PPGs from the single protein. After the two-pass clustering in SILCS-PPI, all selected poses are used to calculate a per-residue PPI preference (PPIP) value by counting the number of contacts between the receptor and ligand (the same protein) atoms within a 5 Å cutoff over all poses. The counts are normalized by the maximum PPIP value to obtain the final PPIP score for all residues that contribute to PPIs. The PPIP score indicates the likelihood of a residue being involved in a PPI that may lead to aggregation or increased viscosity.
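The per-residue contact counting and normalization can be sketched as follows. The 5 Å cutoff and the normalization by the maximum count follow the text; the data layout, the function name and the choice to credit each contact to the residues on both the receptor and ligand copies (which are the same protein) are assumptions.

```python
import math
from collections import Counter


def ppip_scores(poses, cutoff=5.0):
    """Per-residue PPI preference from atom-atom contacts over all poses.

    poses: list of (receptor_atoms, ligand_atoms); each atom is
    (residue_id, (x, y, z)). Returns {residue_id: score in (0, 1]}.
    """
    counts = Counter()
    for receptor_atoms, ligand_atoms in poses:
        for res_r, xyz_r in receptor_atoms:
            for res_l, xyz_l in ligand_atoms:
                if math.dist(xyz_r, xyz_l) <= cutoff:
                    counts[res_r] += 1
                    counts[res_l] += 1
    if not counts:
        return {}
    top = max(counts.values())
    return {res: n / top for res, n in counts.items()}


# Toy usage: one pose with made-up coordinates (Angstrom).
pose = ([(10, (0.0, 0.0, 0.0)), (11, (20.0, 0.0, 0.0))],   # receptor atoms
        [(42, (3.0, 0.0, 0.0)), (43, (4.0, 0.0, 0.0))])    # ligand atoms
ppip = ppip_scores([pose])
```

Residues that never come within the cutoff in any pose are simply absent from the result, which corresponds to a PPIP of zero.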

Run SILCS-Hotspots to map excipient binding sites on the biologic protein. The user can choose a collection of excipient molecules desired for the formulation. In our in-house tests (143), amino acid and sugar excipient molecules used include alanine, arginine, aspartate, citrate, glucose, glutamate, glycine, histidine, lactate, lysine, malate, mannitol, phosphate, proline, sorbitol, succinate, sucrose, threonine, trehalose, and valine. This list may be easily altered or extended as needed.

Combine the calculated PPIP from step 2 and the excipient binding-site profiles from step 3 to investigate the potential effect of excipient molecules on aggregation of the biologic protein. For example, the number of excipient binding sites that satisfy a range of PPIP and energy criteria may be selected. These may then be partitioned into the number of sites involving individual excipients. In addition to LGFE, ligand efficiency (LE), which is defined as the LGFE divided by the number of non-hydrogen atoms in a molecule, is also employed to rank excipients since it is independent of the size of the excipient molecules. In a study on the NIST mAb, it was found that a criterion defined as the number of excipient binding sites that have an average LE < −0.25 kcal/mol and PPIP > 0.1 correlates well in general with the experimental viscosity profile (143). Such a criterion can indicate the ability of an excipient to prevent aggregation since it combines information about favorable excipient binding (more negative LE) with regions more likely to be involved in aggregation (higher PPIP). In practice, the user can try different criteria using the PPIP, LGFE and LE metrics for the biologic protein of interest and even build a regression model using these metrics if experimental aggregation-related data are available. We note that, given the challenges associated with biologics formulations, it is likely that different criteria will be required for different proteins, even for different mAb molecules of the same class.
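The LE definition and the site-counting criterion quoted above are straightforward to express in code. In the sketch below the thresholds are those reported for the NIST mAb study; the function names, the dictionary layout of a binding site and the way a single PPIP value is attached to each site are assumptions, since the text does not specify them.

```python
def ligand_efficiency(lgfe, n_heavy_atoms):
    """LE = LGFE divided by the number of non-hydrogen atoms."""
    return lgfe / n_heavy_atoms


def count_sites_by_excipient(sites, le_cut=-0.25, ppip_cut=0.1):
    """Count, per excipient, the sites with avg_le < le_cut and ppip > ppip_cut.

    sites: iterable of dicts with keys 'excipient', 'avg_le' and 'ppip'.
    """
    counts = {}
    for site in sites:
        if site["avg_le"] < le_cut and site["ppip"] > ppip_cut:
            counts[site["excipient"]] = counts.get(site["excipient"], 0) + 1
    return counts


# Toy usage with made-up site data.
sites = [
    {"excipient": "arginine", "avg_le": -0.31, "ppip": 0.40},
    {"excipient": "arginine", "avg_le": -0.28, "ppip": 0.05},   # PPIP too low
    {"excipient": "sucrose", "avg_le": -0.20, "ppip": 0.60},    # LE too weak
    {"excipient": "sucrose", "avg_le": -0.27, "ppip": 0.15},
]
per_excipient = count_sites_by_excipient(sites)
example_le = ligand_efficiency(-3.0, 10)
```

The per-excipient counts can then be compared with experimental viscosity or aggregation data, or used as features in a regression model as suggested in the text.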

4. Notes

For protein structure prediction using AlphaFold, regions with low pLDDT values are often intrinsically disordered regions (50), which are generally not suitable for drug targeting purposes. The intrinsically disordered regions are often presented as extended polypeptide regions in the predicted 3D structures. If a well-structured region in your model has low pLDDT values, this then might indicate that the quality of the model is questionable and needs to be examined.

The docking pose of a ligand from a SILCS-MC run can have clashes with the protein structure that was used to initialize the SILCS simulation. This is because SILCS-MC docking uses SILCS FragMaps that incorporate the protein flexibility sampled during the MD simulation. An alternative is to visualize the docking pose with the SILCS exclusion map, which can serve as a flexibility-aware alternative to a protein surface representation based on a single rigid protein structure. To present the SILCS-MC docking pose in a classic protein-ligand interaction representation, it is also practical to extract protein structures from the SILCS simulation that have no clashes with the pose. Finally, when combining the SILCS-MC predicted ligand orientation with the protein structure used to initiate the simulation or with one extracted from the SILCS simulations, it is important to perform careful relaxation of the protein around the ligand prior to production MD simulations.

In the current version of SILCS-PPI, only the protein structures that were used to initialize the SILCS simulations are used to construct the PPI complex coordinates. In practice, representative protein structures can be extracted from the SILCS simulations for model construction, and minimization followed by a short MD simulation is also desirable to further refine the complex model for a better PPI representation.

Acknowledgements.

This work was supported by NIH grants R35GM131710 (AM), GM129327 (DW), AI152397 (DW), the University of Maryland Center for Biomolecular Therapeutics (CBT), the Samuel Waxman Cancer Research Foundation, and the Computer-Aided Drug Design (CADD) Center at the University of Maryland, Baltimore.

Footnotes

Conflict of interest: A.D.M. is Co-founder and CSO of SilcsBio LLC.

References:

1. Blaskovich MAT (2020) Antibiotics Special Issue: Challenges and Opportunities in Antibiotic Discovery and Development. ACS Infect Dis 6:1286–1288.

2. Ribeiro da Cunha B; Fonseca LP; Calado CRC (2019) Antibiotic discovery: Where have we come from, where do we go? Antibiotics 8:45.

3. Yu W; Guvench O; MacKerell AD (2013) Computational approaches for the design of protein–protein interaction inhibitors. In: Zinzalla G (ed) Understanding and exploiting protein–protein interactions as drug targets. Future Science Ltd., London, UK, pp 99–102.

4. Yu W; MacKerell AD (2017) Computer-aided drug design method. In: Sass P (ed) Antibiotics methods and protocols. Methods in Molecular Biology. Springer Science+Business Media, New York, USA, pp 85–106.

5. Krebs FS; Esque J; Stote RH (2019) A computational study of the molecular basis of antibiotic resistance in a DXR mutant. J Comput Aided Mol Des 33:927–940.

6. Li J; Beuerman R; Verma CS (2020) Dissecting the Molecular Mechanism of Colistin Resistance in mcr-1 Bacteria. J Chem Inf Model 60:4975–4984.

7. Liu Y; Wang Y; Walsh TR; Yi L; Zhang R; Spencer J; Doi Y; Tian G; Dong B; Huang X; Yu L; Gu D; Ren H; Chen X; Lv L; He D; Zhou H; Liang Z; Liu J; Shen J (2016) Emergence of plasmid-mediated colistin resistance mechanism MCR-1 in animals and human beings in China: a microbiological and molecular biological study. Lancet Infect Dis 16:161–168.

8. O’Neill MJ; Wilks A (2013) The P. aeruginosa Heme Binding Protein PhuS Is a Heme Oxygenase Titratable Regulator of Heme Uptake. ACS Chem Biol 8:1794–1802.

9. Nguyen AT; O’Neill MJ; Watts AM; Robson CL; Lamont IL; Wilks A; Oglesby-Sherrouse AG (2014) Adaptation of Iron Homeostasis Pathways by a Pseudomonas aeruginosa Pyoverdine Mutant in the Cystic Fibrosis Lung. J Bacteriol 196:2265–2276.

10. Liang D; Robinson E; Hom K; Yu W; Nguyen N; Li Y; Zong Q; Wilks A; Xue F (2018) Structure-based design and biological evaluation of inhibitors of the pseudomonas aeruginosa heme oxygenase (pa-HemO). Bioorg Med Chem Lett 28:1024–1029.

11. Xu X; Godoy-Ruiz R; Adipietro KA; Peralta C; Ben-Hail D; Varney KM; Cook ME; Roth BM; Wilder PT; Cleveland T; Grishaev A; Neu HM; Michel SL; Yu W; Beckett D; Rustandi RR; Lancaster C; Loughney JW; Kristopeit A; Christanti S; Olson JW; MacKerell AD; des Georges A; Pozharski E; Weber DJ (2020) Structure of the cell-binding component of the Clostridium difficile binary toxin reveals a di-heptamer macromolecular assembly. Proc Natl Acad Sci USA 117:1049–1058.

12. Varney KM; Bonvin AMJJ; Pazgier M; Malin J; Yu W; Ateh E; Oashi T; Lu W; Huang J; Diepeveen-de Buin M; Bryant J; Breukink E; MacKerell AD Jr.; de Leeuw EPH (2013) Turning Defense into Offense: Defensin Mimetics as Novel Antibiotics Targeting Lipid II. PLoS Pathog 9:e1003732.

13. Fletcher S; Yu W; Huang J; Kwasny SM; Chauhan J; Opperman TJ; MacKerell AD Jr.; de Leeuw EP (2015) Structure-activity exploration of a small-molecule Lipid II inhibitor. Drug Des Devel Ther 9:2383–2394.

14. Chauhan J; Yu W; Cardinale S; Opperman TJ; MacKerell AD; Fletcher S; de Leeuw EP (2020) Optimization of a Benzothiazole Indolene Scaffold Targeting Bacterial Cell Wall Assembly. Drug Des Devel Ther 14:567–574.

15. Tooke CL; Hinchliffe P; Bragginton EC; Colenso CK; Hirvonen VHA; Takebayashi Y; Spencer J (2019) β-Lactamases and β-Lactamase Inhibitors in the 21st Century. J Mol Biol 431:3472–3500.

16. Guvench O; MacKerell AD Jr. (2009) Computational Fragment-Based Binding Site Identification by Ligand Competitive Saturation. PLoS Comput Biol 5:e1000435.

17. Raman EP; Yu W; Guvench O; MacKerell AD (2011) Reproducing Crystal Binding Modes of Ligand Functional Groups Using Site-Identification by Ligand Competitive Saturation (SILCS) Simulations. J Chem Inf Model 51:877–896.

18. Raman EP; Yu W; Lakkaraju SK; MacKerell AD (2013) Inclusion of Multiple Fragment Types in the Site Identification by Ligand Competitive Saturation (SILCS) Approach. J Chem Inf Model 53:3384–3398.

19. Parvaiz N; Ahmad F; Yu W; MacKerell AD; Azam SS (2021) Discovery of beta-lactamase CMY-10 inhibitors for combination therapy against multi-drug resistant Enterobacteriaceae. PLoS ONE 16:e0244967.

20. Faller C; Raman EP; MacKerell A Jr.; Guvench O (2015) Site Identification by Ligand Competitive Saturation (SILCS) Simulations for Fragment-Based Drug Design. In: Klon AE (ed) Fragment-Based Methods in Drug Discovery. Springer, New York, pp 75–87.

21. Yu W; Lakkaraju S; Raman EP; MacKerell A Jr. (2014) Site-Identification by Ligand Competitive Saturation (SILCS) assisted pharmacophore modeling. J Comput Aided Mol Des 28:491–507.

22. Yu W; Lakkaraju SK; Raman EP; Fang L; MacKerell AD (2015) Pharmacophore Modeling Using Site-Identification by Ligand Competitive Saturation (SILCS) with Multiple Probe Molecules. J Chem Inf Model 55:407–420.

23. Abel R; Wang L; Harder ED; Berne BJ; Friesner RA (2017) Advancing drug discovery through enhanced free energy calculations. Acc Chem Res 50:1625–1632.

24. King E; Aitchison E; Li H; Luo R (2021) Recent developments in free energy calculations for drug discovery. Front Mol Biosci 8:712085.

25. Chen J; Wang X; Pang L; Zhang JZH; Zhu T (2019) Effect of mutations on binding of ligands to guanine riboswitch probed by free energy perturbation and molecular dynamics simulations. Nucleic Acids Res 47:6618–6631.

26. Fowler PW (2020) How quickly can we predict trimethoprim resistance using alchemical free energy methods? Interface Focus 10:20190141.

27. Vamathevan J; Clark D; Czodrowski P; Dunham I; Ferran E; Lee G; Li B; Madabhushi A; Shah P; Spitzer M; Zhao S (2019) Applications of machine learning in drug discovery and development. Nat Rev Drug Discov 18:463–477.

28. Jackson PC Jr. (2019) Introduction to artificial intelligence: third edition. Dover Publications Inc., Mineola, New York, USA.

29. Hosny A; Parmar C; Quackenbush J; Schwartz LH; Aerts HJWL (2018) Artificial intelligence in radiology. Nat Rev Cancer 18:500–510.

30. Manning CD (2015) Computational Linguistics and Deep Learning. Comput Linguist 41:701–707.

31. Owens JD; Houston M; Luebke D; Green S; Stone JE; Phillips JC (2008) GPU Computing. Proc IEEE 96:879–899.

32. Melo MCR; Maasch JRMA; de la Fuente-Nunez C (2021) Accelerating antibiotic discovery through artificial intelligence. Commun Biol 4:1050.

33. Anahtar MN; Yang JH; Kanjilal S (2021) Applications of Machine Learning to the Problem of Antimicrobial Resistance: an Emerging Model for Translational Research. J Clin Microbiol 59:e01260–20.

34. Hyun JC; Kavvas ES; Monk JM; Palsson BO (2020) Machine learning with random subspace ensembles identifies antimicrobial resistance determinants from pan-genomes of three pathogens. PLoS Comput Biol 16:e1007608.

35. Stokes JM; Yang K; Swanson K; Jin W; Cubillos-Ruiz A; Donghia NM; MacNair CR; French S; Carfrae LA; Bloom-Ackerman Z; Tran VM; Chiappino-Pepe A; Badran AH; Andrews IW; Chory EJ; Church GM; Brown ED; Jaakkola TS; Barzilay R; Collins JJ (2020) A deep learning approach to antibiotic discovery . Cell 180 :688–702. [PMC free article] [PubMed] [Google Scholar]

36. Corsello SM; Bittker JA; Liu Z; Gould J; McCarren P; Hirschman JE; Johnston SE; Vrcic A; Wong B; Khan M; Asiedu J; Narayan R; Mader CC; Subramanian A; Golub TR (2017) The Drug Repurposing Hub: a next-generation drug library and information resource . Nat. Med 23 :405–408. [PMC free article] [PubMed] [Google Scholar]

37. Towns J; Cockerill T; Dahan M; Foster I; Gaither K; Grimshaw A; Hazlewood V; Lathrop S; Lifka D; Peterson GD; Roskies R; Scott JR; Wilkins-Diehr N (2014) XSEDE: accelerating scientific discovery . Comput Sci Eng . 16 :62–74. [Google Scholar]

38. Kotas C; Naughton T; Imam N (2018) A comparison of Amazon Web Services and Microsoft Azure cloud platforms for high performance computing . 2018 IEEE International Conference on Consumer Electronics (ICCE), pp 1–4. [Google Scholar]

39. Brooks BR; Brooks CL; Mackerell AD; Nilsson L; Petrella RJ; Roux B; Won Y; Archontis G; Bartels C; Boresch S; Caflisch A; Caves L; Cui Q; Dinner AR; Feig M; Fischer S; Gao J; Hodoscek M; Im W; Kuczera K; Lazaridis T; Ma J; Ovchinnikov V; Paci E; Pastor RW; Post CB; Pu JZ; Schaefer M; Tidor B; Venable RM; Woodcock HL; Wu X; Yang W; York DM; Karplus M (2009) CHARMM: The biomolecular simulation program . J Comput Chem . 30 :1545–1614. [PMC free article] [PubMed] [Google Scholar]

40. Van Der Spoel D; Lindahl E; Hess B; Groenhof G; Mark AE; Berendsen HJC (2005) GROMACS: Fast, flexible, and free . J Comput Chem . 26 :1701–1718. [PubMed] [Google Scholar]

41. Phillips JC; Hardy DJ; Maia JD; Stone JE; Ribeiro JV; Bernardi RC; Buch R; Fiorin G; Hénin J; Jiang W; McGreevy R (2020) Scalable molecular dynamics on CPU and GPU architectures with NAMD . J Chem Phys , 153 :044130. [PMC free article] [PubMed] [Google Scholar]

42. Eastman P; Swails J; Chodera JD; McGibbon RT; Zhao Y; Beauchamp KA; Wang L; Simmonett AC; Harrigan MP; Stern CD; Wiewiora RP; Brooks BR; Pande VS (2017) OpenMM 7: Rapid development of high performance algorithms for molecular dynamics . PLoS Comput Biol 13 :e1005659. [PMC free article] [PubMed] [Google Scholar]

43. Hynninen A; Crowley MF (2014) New faster CHARMM molecular dynamics engine . J Comput Chem . 35 :406–413. [PMC free article] [PubMed] [Google Scholar]

44. Kohnke B; Kutzner C; Grubmuller H (2020) A GPU-Accelerated Fast Multipole Method for GROMACS: Performance and Accuracy . J Chem Theory Comput . 16 :6938–6949. [PMC free article] [PubMed] [Google Scholar]

45. Harvey MJ; Giupponi G; De Fabritiis G (2009) ACEMD: Accelerating biomolecular dynamics in the microsecond time scale . J Chem Theory Comput . 5 :1632–1639. [PubMed] [Google Scholar]

46. Bernstein FC; Koetzle TF; Williams GJB; Meyer EF Jr; Brice MD; Rodgers JR; Kennard O; Shimanouchi T; Tasumi M (1977) The protein data bank: A computer-based archival file for macromolecular structures . J Mol Biol . 112 :535–542. [PubMed] [Google Scholar]

47. Renaud JP; Chari A; Ciferri C; Liu W; Remigy H; Stark H; Wiesmann C (2018) Cryo-EM in drug discovery: achievements, limitations and prospects . Nat Rev Drug Discov . 17 :471–492. [PubMed] [Google Scholar]

48. Baek M; DiMaio F; Anishchenko I; Dauparas J; Ovchinnikov S; Lee GR; Wang J; Cong Q; Kinch LN; Schaeffer RD; Millán C; Park H; Adams C; Glassman CR; DeGiovanni A; Pereira JH; Rodrigues AV; van Dijk AA; Ebrecht AC; Opperman DJ; Sagmeister T; Buhlheller C; Pavkov-Keller T; Rathinaswamy MK; Dalwadi U; Yip CK; Burke JE; Garcia KC; Grishin NV; Adams PD; Read RJ; Baker D (2021) Accurate prediction of protein structures and interactions using a three-track neural network . Science . 373 :871–876. [PMC free article] [PubMed] [Google Scholar]

49. Jumper J; Evans R; Pritzel A; Green T; Figurnov M; Ronneberger O; Tunyasuvunakool K; Bates R; Žídek A; Potapenko A; Bridgland A; Meyer C; Kohl SAA; Ballard AJ; Cowie A; Romera-Paredes B; Nikolov S; Jain R; Adler J; Back T; Petersen S; Reiman D; Clancy E; Zielinski M; Steinegger M; Pacholska M; Berghammer T; Bodenstein S; Silver D; Vinyals O; Senior AW; Kavukcuoglu K; Kohli P; Hassabis D (2021) Highly accurate protein structure prediction with AlphaFold . Nature 596 :583–589. [PMC free article] [PubMed] [Google Scholar]

50. Tunyasuvunakool K; Adler J; Wu Z; Green T; Zielinski M; Žídek A; Bridgland A; Cowie A; Meyer C; Laydon A; Velankar S; Kleywegt GJ; Bateman A; Evans R; Pritzel A; Figurnov M; Ronneberger O; Bates R; Kohl SAA; Potapenko A; Ballard AJ; Romera-Paredes B; Nikolov S; Jain R; Clancy E; Reiman D; Petersen S; Senior AW; Kavukcuoglu K; Birney E; Kohli P; Jumper J; Hassabis D (2021) Highly accurate protein structure prediction for the human proteome . Nature 596 :590–596. [PMC free article] [PubMed] [Google Scholar]

51. Varadi M; Anyango S; Deshpande M; Nair S; Natassia C; Yordanova G; Yuan D; Stroe O; Wood G; Laydon A; Žídek A; Green T; Tunyasuvunakool K; Petersen S; Jumper J; Clancy E; Green R; Vora A; Lutfi M; Figurnov M; Cowie A; Hobbs N; Kohli P; Kleywegt G; Birney E; Hassabis D; Velankar S (2022) AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models , Nucleic Acids Research , 50 :D439–D444. [PMC free article] [PubMed] [Google Scholar]

52. MacKerell AD; Bashford D; Bellott M; Dunbrack RL; Evanseck JD; Field MJ; Fischer S; Gao J; Guo H; Ha S; Joseph-McCarthy D; Kuchnir L; Kuczera K; Lau FTK; Mattos C; Michnick S; Ngo T; Nguyen DT; Prodhom B; Reiher WE; Roux B; Schlenkrich M; Smith JC; Stote R; Straub J; Watanabe M; Wiórkiewicz-Kuczera J; Yin D; Karplus M (1998) All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins . J Phys Chem B 102 :3586–3616. [PubMed] [Google Scholar]

53. Best RB; Zhu X; Shim J; Lopes PEM; Mittal J; Feig M; MacKerell AD (2012) Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ and Side-Chain χ1 and χ2 Dihedral Angles . Journal of Chemical Theory and Computation 8 :3257–3273. [PMC free article] [PubMed] [Google Scholar]

54. Vanommeslaeghe K; Hatcher E; Acharya C; Kundu S; Zhong S; Shim J; Darian E; Guvench O; Lopes P; Vorobyov I; Mackerell AD (2010) CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields . Journal of Computational Chemistry 31 :671–690. [PMC free article] [PubMed] [Google Scholar]

55. Yu W; He X; Vanommeslaeghe K; MacKerell AD (2012) Extension of the CHARMM general force field to sulfonyl-containing compounds and its utility in biomolecular simulations . Journal of Computational Chemistry 33 :2451–2468. [PMC free article] [PubMed] [Google Scholar]

56. Cornell WD; Cieplak P; Bayly CI; Gould IR; Merz KM; Ferguson DM; Spellmeyer DC; Fox T; Caldwell JW; Kollman PA (1995) A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules . Journal of the American Chemical Society 117 :5179–5197. [Google Scholar]

57. Wang J; Wolf RM; Caldwell JW; Kollman PA; Case DA (2004) Development and testing of a general amber force field . Journal of Computational Chemistry 25 :1157–1174. [PubMed] [Google Scholar]

58. Vanommeslaeghe K; MacKerell AD (2012) Automation of the CHARMM General Force Field (CGenFF) I: Bond Perception and Atom Typing . J Chem Infor Model . 52 :3144–3154. [PMC free article] [PubMed] [Google Scholar]

59. Vanommeslaeghe K; Raman EP; MacKerell AD (2012) Automation of the CHARMM General Force Field (CGenFF) II: Assignment of Bonded Parameters and Partial Atomic Charges . J Chem Infor Model . 52 :3155–3168. [PMC free article] [PubMed] [Google Scholar]

60. Kumar A; Yoluk O; MacKerell AD (2019) FFParam: Standalone package for CHARMM additive and Drude polarizable force field parametrization of small molecules . J Comput Chem . 41 :958–970. [PMC free article] [PubMed] [Google Scholar]

61. Lopes PEM; Huang J; Shim J; Luo Y; Li H; Roux B; MacKerell AD Jr. (2013) Polarizable Force Field for Peptides and Proteins Based on the Classical Drude Oscillator . J. Chem. Theory Comput 9 :5430–5449. [PMC free article] [PubMed] [Google Scholar]

62. Lemkul JA; Huang J; Roux B; MacKerell AD (2016) An Empirical Polarizable Force Field Based on the Classical Drude Oscillator Model: Development History and Recent Applications . Chem Rev . 116 :4983–5013. [PMC free article] [PubMed] [Google Scholar]

63. Ponder JW; Wu C; Ren P; Pande VS; Chodera JD; Schnieders MJ; Haque I; Mobley DL; Lambrecht DS; DiStasio RA; Head-Gordon M; Clark GNI; Johnson ME; Head-Gordon T (2010) Current status of the amoeba polarizable force field . J Phys Chem B 114 :2549–2564. [PMC free article] [PubMed] [Google Scholar]

64. Huang J; Lopes PEM; Roux B; MacKerell AD Jr. (2014) Recent Advances in Polarizable Force Fields for Macromolecules: Microsecond Simulations of Proteins Using the Classical Drude Oscillator Model . J. Phys. Chem. Lett 5 :3144–3150. [PMC free article] [PubMed] [Google Scholar]

65. Huang J; Lemkul JA; Eastman PK; MacKerell AD (2018) Molecular dynamics simulations using the drude polarizable force field on GPUs with OpenMM: Implementation, validation, and benchmarks . J Comput Chem . 39 :1682–1689. [PMC free article] [PubMed] [Google Scholar]

66. Mobley DL; Bannan CC; Rizzi A; Bayly CI; Chodera JD; Lim VT; Lim NM; Beauchamp KA; Slochower DR; Shirts MR; Gilson MK; Eastman PK (2018) Escaping Atom Types in Force Fields Using Direct Chemical Perception . J Chem Theory Comput . 14 :6076–6092. [PMC free article] [PubMed] [Google Scholar]

67. Qiu Y; Smith DGA; Boothroyd S; Jang H; Hahn DF; Wagner J; Bannan CC; Gokey T; Lim VT; Stern CD; Rizzi A; Tjanaka B; Tresadern G; Lucas X; Shirts MR; Gilson MK; Chodera JD; Bayly CI; Mobley DL; Wang LP (2021) Development and Benchmarking of Open Force Field v1.0.0-the Parsley Small-Molecule Force Field . J Chem Theory Comput . 17 :6262–6280. [PMC free article] [PubMed] [Google Scholar]

68. Unke OT; Meuwly M (2019) PhysNet: A Neural Network for Predicting Energies, Forces, Dipole Moments, and Partial Charges . J Chem Theory Comput . 15 :3678–3693. [PubMed] [Google Scholar]

69. Poltavsky I; Tkatchenko A (2021) Machine Learning Force Fields: Recent Advances and Remaining Challenges . J Phys Chem Lett . 12 :6551–6564. [PubMed] [Google Scholar]

70. Bender BJ; Gahbauer S; Luttens A; Lyu J; Webb CM; Stein RM; Fink EA; Balius TE; Carlsson J; Irwin JJ; Shoichet BK (2021) A practical guide to large-scale docking . Nat Protoc . 16 :4799–4832. [PMC free article] [PubMed] [Google Scholar]

71. Schaller D; Šribar D; Noonan T; Deng L; Nguyen TN; Pach S; Machalz D; Bermudez M; Wolber G (2020) Next generation 3D pharmacophore modeling . Wiley Interdiscip. Rev. Comput. Mol. Sci 10 :e1468. [Google Scholar]

72. Trott O; Olson AJ (2010) AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading . Journal of Computational Chemistry 31 :455–461. [PMC free article] [PubMed] [Google Scholar]

73. Verdonk ML; Cole JC; Hartshorn MJ; Murray CW; Taylor RD (2003) Improved protein-ligand docking using GOLD . Proteins: Struct. Funct. Bioinf 52 :609–623. [PubMed] [Google Scholar]

74. Pagadala NS; Syed K; Tuszynski J (2017) Software for molecular docking: a review . Biophys Rev . 9 :91–102. [PMC free article] [PubMed] [Google Scholar]

75. Wójcikowski M; Zielenkiewicz P; Siedlecki P (2015) Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field . J Cheminform . 7 :26. [PMC free article] [PubMed] [Google Scholar]

76. Gorgulla C; Boeszoermenyi A; Wang ZF; Fischer PD; Coote PW; Padmanabha Das KM; Malets YS; Radchenko DS; Moroz YS; Scott DA; Fackeldey K; Hoffmann M; Iavniuk I; Wagner G; Arthanari H (2020) An open-source drug discovery platform enables ultra-large virtual screens . Nature . 580 :663–668. [PMC free article] [PubMed] [Google Scholar]

77. Kochnev Y; Hellemann E; Cassidy KC; Durrant JD (2020) Webina: an open-source library and web app that runs AutoDock Vina entirely in the web browser . Bioinformatics . 36 :4513–4515. [PMC free article] [PubMed] [Google Scholar]

78. Murail S; de Vries SJ; Rey J; Moroy G; Tufféry P (2021) SeamDock: An Interactive and Collaborative Online Docking Resource to Assist Small Compound Molecular Docking . Front Mol Biosci . 8 :716466. [PMC free article] [PubMed] [Google Scholar]

79. Koes DR; Camacho CJ (2011) Pharmer: Efficient and Exact Pharmacophore Search . Journal of Chemical Information and Modeling 51 :1307–1314. [PMC free article] [PubMed] [Google Scholar]

80. Koes DR; Camacho CJ (2012) ZINCPharmer: pharmacophore search of the ZINC database . Nucleic Acids Res . 40 :W409–W414. [PMC free article] [PubMed] [Google Scholar]

81. Sunseri J; Koes DR (2016) Pharmit: interactive exploration of chemical space . Nucleic Acids Res . 44 :W442–W448. [PMC free article] [PubMed] [Google Scholar]

82. Irwin JJ; Tang KG; Young J; Dandarchuluun C; Wong BR; Khurelbaatar M; Moroz YS; Mayfield J; Sayle RA (2020) ZINC20-A Free Ultralarge-Scale Chemical Database for Ligand Discovery . J Chem Inf Model . 60 :6065–6073. [PMC free article] [PubMed] [Google Scholar]

84. Grygorenko OO; Radchenko DS; Dziuba I; Chuprina A; Gubina KE; Moroz YS (2020) Generating multibillion chemical space of readily accessible screening compounds . iScience . 23 :101681. [PMC free article] [PubMed] [Google Scholar]

85. Boyd NK; Teng C; Frei CR Brief overview of approaches and challenges in new antibiotic development: A focus on drug repurposing . (2021) Front Cell Infect Microbiol . 11 :684515. [PMC free article] [PubMed] [Google Scholar]

86. Konreddy AK; Rani GU; Lee K; Choi Y (2019) Recent drug-repurposing-driven advances in the discovery of novel antibiotics . Curr Med Chem . 26 :5363–5388. [PubMed] [Google Scholar]

87. Discovery Studio Modeling Environment , Dassault Systèmes BIOVIA, https://www.3ds.com/products-services/biovia/ : San Diego, CA, USA. [Google Scholar]

88. Molecular Operating Environment (MOE) , Chemical Computing Group Inc., https://www.chemcomp.com : Montreal, Canada. [Google Scholar]

89. OEChem , OpenEye Scientific Software, Inc. https://www.eyesopen.com : Santa Fe, NM, USA. [Google Scholar]

90. SILCS , SilcsBio, LLC. https://www.silcsbio.com : Baltimore, MD, USA. [Google Scholar] 91. PlayMolecule , Acellera Inc., https://www.acellera.com : Barcelona, Spain. [Google Scholar]

92. Muhammed MT; Aki-Yalcin E (2019) Homology modeling in drug discovery: Overview, current applications, and future perspectives . Chem Biol Drug Des . 93 :12–20. [PubMed] [Google Scholar]

93. Moore PB; Hendrickson WA; Henderson R; Brunger AT (2022) The protein-folding problem: Not yet solved . Science . 375 :507. [PubMed] [Google Scholar]

94. Hollingsworth SA; Dror RO (2018) Molecular Dynamics Simulation for All . Neuron . 99 :1129–1143. [PMC free article] [PubMed] [Google Scholar]

95. Lamoureux G; Harder E; Vorobyov IV; Roux B; MacKerell AD (2006) A polarizable model of water for molecular dynamics simulations of biomolecules Chem . Phys. Lett 418 :245–249. [Google Scholar]

96. Yu W; Lopes PEM; Roux B; MacKerell AD (2013) Six-Site Polarizable Model of Water Based on the Classical Drude Oscillator J. Chem. Phys 138 :034508. [PMC free article] [PubMed] [Google Scholar]

97. Lin F; Huang J; Pandey P; Rupakheti C; Li J; Roux BT; MacKerell AD (2020) Further Optimization and Validation of the Classical Drude Polarizable Protein Force Field . J Chem Theory Comput . 16 :3221–3239. [PMC free article] [PubMed] [Google Scholar]

98. Shi Y; Xia Z; Zhang J; Best R; Wu C; Ponder JW; Ren P (2013) The polarizable atomic multipole-based AMOEBA force field for proteins . J Chem Theory Comput . 9 :4046–4063. [PMC free article] [PubMed] [Google Scholar]

99. Kunz AP; van Gunsteren WF (2009) Development of a nonlinear classical polarization model for liquid water and aqueous solutions: COS/D . J Phys Chem A . 113 :11570–11579. [PubMed] [Google Scholar]

100. Visscher KM; Geerke DP (2020) Deriving a polarizable force field for biomolecular building blocks with minimal empirical calibration . J Phys Chem B . 124 :1628–1636. [PMC free article] [PubMed] [Google Scholar]

101. Donchev AG; Ozrin VD; Subbotin MV; Tarasov OV; Tarasov VI (2005) A quantum mechanical polarizable force field for biomolecular interactions . Proc. Natl. Acad. Sci. U. S. A 102 , 7829–7834. [PMC free article] [PubMed] [Google Scholar]

102. Goel H; Yu W; Ustach VD; Aytenfisu AH; Sun D; MacKerell AD (2020) Impact of electronic polarizability on protein-functional group interactions . Phys Chem Chem Phys . 22 :6848–6860. [PMC free article] [PubMed] [Google Scholar]

103. Jo S; Cheng X; Lee J; Kim S; Park SJ; Patel DS; Beaven AH; Lee KI; Rui H; Park S; Lee HS; Roux B; MacKerell AD; Klauda JB; Qi Y; Im W (2017) CHARMM-GUI 10 years for biomolecular modeling and simulation . J Comput Chem . 38 :1114–1124. [PMC free article] [PubMed] [Google Scholar]

104. Kognole A; Lee J; Park SJ; Jo S; Chatterjee P; Lemkul JA; Huang J; MacKerell AD; Im W (2022) CHARMM-GUI Drude prepper for molecular dynamics simulation using the classical Drude polarizable force field . J Comput Chem . 43 :359–375. [PMC free article] [PubMed] [Google Scholar]

105. Chowdhary J; Harder E; Lopes PE; Huang L; MacKerell AD; Roux B (2013) A polarizable force field of dipalmitoylphosphatidylcholine based on the classical Drude model for molecular dynamics simulations of lipids . J Phys Chem B . 117 :9142–9160. [PMC free article] [PubMed] [Google Scholar]

106. Lamoureux G; MacKerell AD; Roux B (2003) A simple polarizable model of water based on classical Drude oscillators . J. Chem. Phys 119 :5185–5197. [Google Scholar]

107. Genheden S; Ryde U (2015) The MM/PBSA and MM/GBSA methods to estimate ligand-binding affinities . Expert Opin Drug Discov . 10 :449–461. [PMC free article] [PubMed] [Google Scholar]

108. Ustach VD; Lakkaraju SK; Jo S; Yu W; Jiang W; MacKerell AD (2019) Optimization and evaluation of Site-Identification by Ligand Competitive Saturation (SILCS) as a tool for target-based ligand optimization . J Chem Inf Model . 59 :3018–3035. [PMC free article] [PubMed] [Google Scholar]

109. Goel H; Hazel A; Ustach VD; Jo S; Yu W; MacKerell AD (2021) Rapid and accurate estimation of protein-ligand relative binding affinities using site-identification by ligand competitive saturation . Chem Sci . 12 :8844–8858. [PMC free article] [PubMed] [Google Scholar]

110. Goel H; Hazel A; Yu W; Jo S; MacKerell AD (2022) Application of site-identification by ligand competitive saturation in computer-aided drug design . New J Chem . 46 :919–932. [PMC free article] [PubMed] [Google Scholar]

111. Lanning ME; Yu W; Yap JL; Chauhan J; Chen L; Whiting E; Pidugu LS; Atkinson T; Bailey H; Li W; Roth BM; Hynicka L; Chesko K; Toth EA; Shapiro P; MacKerell AD; Wilder PT; Fletcher S (2016) Structure-based design of N-substituted 1-hydroxy-4-sulfamoyl-2-naphthoates as selective inhibitors of the Mcl-1 oncoprotein . Eur J Med Chem . 113 :273–292. [PMC free article] [PubMed] [Google Scholar]

112. Young BD; Yu W; Rodríguez DJV; Varney KM; MacKerell AD; Weber DJ (2021) Specificity of molecular fragments binding to S100B versus S100A1 as identified by NMR and site identification by ligand competitive saturation (SILCS) . Molecules . 26 :381. [PMC free article] [PubMed] [Google Scholar]

113. Broomhead NK; Soliman ME (2017) Can we rely on computational predictions to correctly identify ligand binding sites on novel protein drug targets? Assessment of binding site prediction methods and a protocol for validation of predicted binding sites . Cell Biochem Biophys . 75 :15–23. [PubMed] [Google Scholar]

114. Shanina E; Kuhaudomlarp S; Lal K; Seeberger PH; Imberty A; Rademacher C (2022) Druggable allosteric sites in β-Propeller lectins . Angew Chem Int Ed . 61 :e202109339. [PMC free article] [PubMed] [Google Scholar]

115. MacKerell AD; Jo S; Lakkaraju SK; Lind C; Yu W (2020) Identification and characterization of fragment binding sites for allosteric ligand design using the site identification by ligand competitive saturation hotspots approach (SILCS-Hotspots) . Biochim Biophys Acta Gen Subj . 1864 :129519. [PMC free article] [PubMed] [Google Scholar]

116. O’Reilly M; Cleasby A; Davies TG; Hall RJ; Ludlow RF; Murray CW; Tisi D; Jhoti H (2019) Crystallographic screening using ultra-low-molecular-weight ligands to guide drug design . Drug Discov. Today 24 :1081–1086. [PubMed] [Google Scholar]

117. Taylor RD; MacCoss M; Lawson AD (2014) Rings in drugs . J Med Chem . 57 , 5845–5859. [PubMed] [Google Scholar]

118. Ness S; Martin R; Kindler AM; Paetzel M; Gold M; Jensen SE; Jones JB; Strynadka NC (2000) Structure-based design guides the improved efficacy of deacylation transition state analogue inhibitors of TEM-1 beta-Lactamase . Biochemistry . 39 :5312–5321. [PubMed] [Google Scholar]

119. Horn JR; Shoichet BK (2004) Allosteric inhibition through core disruption . J Mol Biol . 336 :1283–1291. [PubMed] [Google Scholar]

120. Trisciuzzi D; Nicolotti O; Miteva MA; Villoutreix BO (2019) Analysis of solvent-exposed and buried co-crystallized ligands: a case study to support the design of novel protein-protein interaction inhibitors . Drug Discov Today . 24 :551–559. [PubMed] [Google Scholar]

121. Mitternacht S (2016) FreeSASA: An open source C library for solvent accessible surface area calculations . F1000Res . 5 :189. [PMC free article] [PubMed] [Google Scholar]

122. Delcour AH (2009) Outer membrane permeability and antibiotic resistance . Biochim Biophys Acta . 1794 :808–816. [PMC free article] [PubMed] [Google Scholar]

123. May KL; Grabowicz M (2018) The bacterial outer membrane is an evolving antibiotic barrier . Proc Natl Acad Sci U S A . 115 :8852–8854. [PMC free article] [PubMed] [Google Scholar]

124. Bennion BJ; Be NA; McNerney MW; Lao V; Carlson EM; Valdez CA; Malfatti MA; Enright HA; Nguyen TH; Lightstone FC; Carpenter TS (2017) Predicting a Drug’s Membrane Permeability: A Computational Model Validated With in Vitro Permeability Assay Data . J Phys Chem B . 121 :5228–5237. [PubMed] [Google Scholar]

125. Marrink S; Berendsen HJC (1994) Simulation of water transport through a lipid membrane . J. Phys. Chem 98 :4155–4168. [Google Scholar]

126. Lind C; Pandey P; Pastor RW; MacKerell AD (2021) Functional group distributions, partition coefficients, and resistance factors in lipid bilayers using site identification by ligand competitive saturation . J. Chem. Theory Comput 17 :3188–3202. [PMC free article] [PubMed] [Google Scholar]

127. Gao Y; Lee J; Widmalm G; Im W (2020) Modeling and Simulation of Bacterial Outer Membranes with Lipopolysaccharides and Enterobacterial Common Antigen . J. Phys. Chem. B 124 :5948–5956. [PubMed] [Google Scholar]

128. Kansy M; Senner F; Gubernator K (1998) Physicochemical high throughput screening: parallel artificial membrane permeation assay in the description of passive absorption processes . J Med Chem . 41 :1007–1010. [PubMed] [Google Scholar]

129. Lee J; Patel DS; Ståhle J; Park SJ; Kern NR; Kim S; Lee J; Cheng X; Valvano MA; Holst O; Knirel YA; Qi Y; Jo S; Klauda JB; Widmalm G; Im W (2019) CHARMM-GUI Membrane Builder for Complex Biological Membrane Simulations with Glycolipids and Lipoglycans . J Chem Theory Comput . 15 :775–786. [PubMed] [Google Scholar]

130. Carro L (2018) Protein-protein interactions in bacteria: a promising and challenging avenue towards the discovery of new antibiotics . Beilstein J Org Chem . 14 :2881–2896. [PMC free article] [PubMed] [Google Scholar]

131. Cossar PJ; Lewis PJ; McCluskey A (2020) Protein-protein interactions as antibiotic targets: A medicinal chemistry perspective . Med Res Rev . 40 :469–494. [PubMed] [Google Scholar]

132. Kahan R; Worm DJ; de Castro GV; Ng S; Barnard A (2021) Modulators of protein-protein interactions as antimicrobial agents . RSC Chem Biol . 2 :387–409. [PMC free article] [PubMed] [Google Scholar]

133. Huang S (2014) Search strategies and evaluation in protein–protein docking: principles, advances and challenges . Drug Discov Today . 19 :1081–1096. [PubMed] [Google Scholar]

134. Yu W; Jo S; Lakkaraju SK; Weber DJ; MacKerell AD (2019) Exploring protein-protein interactions using the site-identification by ligand competitive saturation methodology. Proteins : Struct. Funct. Bioinf 87 :289–301. [PMC free article] [PubMed] [Google Scholar]

135. Solernou A; Fernandez-Recio J (2010) Protein docking by rotation-based uniform sampling (RotBUS) with fast computing of intermolecular contact distance and residue desolvation . BMC Bioinformatics . 11 :352. [PMC free article] [PubMed] [Google Scholar]

136. Gaile GL; Burt JE (1980) Directional Statistics. Concepts and Techniques in Modern Geography . 25th ed. Norwich: Geo Books. [Google Scholar]

137. Challener C (2018) Fighting bacterial resistance with biologics . Pharm Technol . 42 :36–37. [Google Scholar]

138. Kollef MH; Betthauser KD (2021) Monoclonal antibodies as antibacterial therapies: thinking outside of the box . Lancet Infect Dis . 21 :1201–1202. [PubMed] [Google Scholar]

139. Zurawski DV; McLendon MK (2020) Monoclonal antibodies as an antibacterial approach against bacterial pathogens . Antibiotics (Basel) . 9 :155. [PMC free article] [PubMed] [Google Scholar]

140. Watson A; Li H; Ma B; Weiss R; Bendayan D; Abramovitz L; Ben-Shalom N; Mor M; Pinko E; Bar Oz M; Wang Z; Du F; Lu Y; Rybniker J; Dahan R; Huang H; Barkan D; Xiang Y; Javid B; Freund NT (2021) Human antibodies targeting a Mycobacterium transporter protein mediate protection against tuberculosis . Nat Commun . 12 :602. [PMC free article] [PubMed] [Google Scholar]

141. Shire SJ (2009) Formulation and manufacturability of biologics . Curr Opin Biotechnol . 20 :708–714. [PubMed] [Google Scholar]

142. Kamerzell TJ; Esfandiary R; Joshi SB; Middaugh CR; Volkin DB (2011) Protein-excipient interactions: mechanisms and biophysical characterization applied to protein formulation development . Adv Drug Deliv Rev . 63 :1118–1159. [PubMed] [Google Scholar]

143. Jo S; Xu A; Curtis JE; Somani S; MacKerell AD (2020) Computational Characterization of Antibody-Excipient Interactions for Rational Excipient Selection Using the Site Identification by Ligand Competitive Saturation-Biologics Approach . Mol Pharm . 17 :4323–4333. [PMC free article] [PubMed] [Google Scholar]

144. Somani S; Jo S; Thirumangalathu R; Rodrigues D; Tanenbaum LM; Amin K; MacKerell AD; Thakkar SV (2021) Toward Biotherapeutics Formulation Composition Engineering using Site-Identification by Ligand Competitive Saturation (SILCS) . J Pharm Sci . 110 :1103–1110. [PMC free article] [PubMed] [Google Scholar]