Using supercomputers to understand biomolecular properties

The use of futuristic computers to analyze the biomolecular structures responsible for disease and quickly design a perfect cure is commonplace in science fiction. The overall idea is both hopeful and well reasoned. But determining the motion and functions of biomolecular particles is no easy task, even with today’s most powerful supercomputers.

Scientists at the University of Chicago are working to understand the function and movement of atoms in biological systems. The team is modifying its code to run on the upcoming Intel-HPE Aurora supercomputer, which is expected to deliver more than two exaflops of peak double-precision computing performance. Aurora will be located at the US Department of Energy’s (DOE) Argonne National Laboratory. The research group is supported by the Aurora Early Science Program (ESP) at the Argonne Leadership Computing Facility (ALCF).

Dr. Benoît Roux of the University of Chicago, principal investigator on the ESP project, explains that the team is running computer simulations on a pre-production supercomputer using Intel hardware and software that will be part of the future Aurora supercomputer. The system includes pre-production Intel graphics processing units (GPUs).

Roux says: “The goal of our ESP project is to develop new technologies to simulate virtual models of biomolecular systems with unprecedented accuracy. As we move on to running computer simulations on an exascale supercomputer such as the future Aurora system, we hope that we are moving towards a rational understanding of biological systems.”

Defining the nature and movement of biochemical atomic particles

Molecules follow the laws of physics, thermodynamics and chemistry in their behavior and movement. Roux notes that living cells contain structures such as membranes, proteins and enzymes. For example, membranes are thin sheaths formed from lipids that separate different compartments of the cell. Membranes are usually 30 or 40 angstroms thick, but some membrane proteins are large and can be 100 angstroms wide. Cell membranes communicate and signal what is happening outside the cell. Membrane proteins generally span the membrane and have functions such as pumping chemicals across it or controlling the passage of different substances.

There are so many aspects of molecular biochemistry that a computer model only provides an approximation of how molecules move and behave. Biology is very complex and unexpected factors can be discovered during research. Roux says it’s important to compare the results of computer simulations with real experiments performed in laboratories or clinical trials.

“We use all-atom molecular dynamics simulations to rigorously calculate conformational free energies and binding free energies. We are particularly interested in understanding the function of biomolecular systems. We are also developing new computational approaches (polarizable force field, solvent boundary potentials, efficient sampling methods) to study biological macromolecular systems,” says Roux.

Biomolecular Force Field Energy Case Study

The main focus of the team’s ESP research is using supercomputer simulations to determine the free energy landscapes underlying the function of two large membrane transporters. The team studies the Ca2+-ATPase pump (SERCA) and the P-glycoprotein (PGP) multidrug resistance transporter [1-6]. The objective is to better understand their mechanisms by carrying out a quantitative characterization of the free energy pathways and landscapes that govern the movements along them.
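To give a sense of what a free energy landscape is, the following minimal Python sketch applies the textbook Boltzmann relation F = -kT ln(p) to a histogram of how often a system visits each state. It is an illustration only and not the team's method: the function name `free_energy_profile` and the example counts are hypothetical, and production studies of large transporters typically rely on more elaborate sampling techniques than simple counting.

```python
import math

# Boltzmann constant in kcal/(mol*K), a common unit choice in biomolecular work.
KB_KCAL_PER_MOL_K = 0.0019872041


def free_energy_profile(counts, temperature_k=300.0):
    """Convert state visit counts into relative free energies (kcal/mol).

    Uses F_i = kT * ln(p_max / p_i), so the most populated state sits at
    zero and rarer states sit higher. Unvisited states get +infinity.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one observation")
    kt = KB_KCAL_PER_MOL_K * temperature_k
    probabilities = [c / total for c in counts]
    p_max = max(probabilities)
    return [
        kt * math.log(p_max / p) if p > 0 else math.inf
        for p in probabilities
    ]


if __name__ == "__main__":
    # Hypothetical example: a two-state system seen 900 and 100 times.
    for state, energy in enumerate(free_energy_profile([900, 100])):
        print(f"state {state}: {energy:.3f} kcal/mol")
```

A state visited nine times less often than another lies roughly 1.3 kcal/mol higher at 300 K, which shows why rarely visited intermediate states are hard to characterize by direct simulation alone.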

SERCA and PGP represent two major classes of transporter proteins. Both proteins use adenosine triphosphate (ATP) as an energy source for transport activity involving complex conformational transitions that are tightly coupled to the binding, unbinding, and hydrolysis of ATP/ADP. PGP is a biomedically important member of the larger superfamily of ATP Binding Cassette transporters and a mediator of multidrug resistance in many types of cancer. PGP is a well-characterized membrane transporter with the ability to efflux drug molecules out of the cancer cell, which reduces the effectiveness of chemotherapy. Cancer cells upregulate PGP expression as an adaptive response to escape chemotherapy-induced cell death.

Figure 1 shows an example of these structures. The results of this research may provide important information regarding multidrug resistance in cancer.

Figure 1. Membrane-bound structures for the two transport proteins studied in this research: the Ca2+-ATPase SERCA pump (left) and the PGP multidrug transporter (right). Courtesy of Dr. Roux, University of Chicago.

Software used in biomolecular research

The team performs large-scale molecular dynamics (MD) simulations using the Nanoscale Molecular Dynamics (NAMD) program in their research [7]. NAMD is a parallel MD code designed for high-performance simulation of large biomolecular systems. NAMD supports biological research measuring the dynamics of cellular processes at atomic and sub-nanosecond resolution not achievable by experimental methods.
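To make the idea of an MD simulation concrete, here is a minimal Python sketch of velocity Verlet integration, a scheme widely used in MD codes, applied to a single particle on a harmonic spring. This is a toy illustration under simplifying assumptions, not NAMD code: the function names, the spring force, and the reduced units are all hypothetical.

```python
def harmonic_force(x, k=1.0):
    """Restoring force of a spring with stiffness k (toy stand-in for a force field)."""
    return -k * x


def velocity_verlet(x, v, dt, n_steps, mass=1.0, force=harmonic_force):
    """Advance position x and velocity v by n_steps of size dt.

    Returns the trajectory as a list of (x, v) tuples, including the start.
    """
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # recompute force at new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


if __name__ == "__main__":
    traj = velocity_verlet(x=1.0, v=0.0, dt=0.01, n_steps=1000)
    print("initial energy:", total_energy(*traj[0]))
    print("final energy:  ", total_energy(*traj[-1]))
```

The two printed energies should agree closely, which is the property that makes this kind of integrator attractive for long simulations. A real biomolecular simulation repeats the same loop for every atom, with forces coming from a full force field rather than a single spring.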

Preparing to Run on the Aurora Supercomputer

The Roux team has begun migrating the NAMD code for Aurora on test bed systems at Argonne’s Joint Laboratory for System Evaluation. Aurora will integrate new Intel technologies such as the Intel Xe-HPC GPUs (codenamed Ponte Vecchio) and the next-generation Intel Xeon Scalable processor (codenamed Sapphire Rapids), both equipped with high-bandwidth memory designed to improve memory utilization. The team uses SYCL compiled by the Data Parallel C++ (DPC++) compiler, which is part of oneAPI, the cross-industry initiative led by Intel and designed to unify and simplify the development of applications across various computer architectures.

Wei Jiang, who was previously a postdoctoral researcher at the Roux laboratory, is now at Argonne as a computer science researcher and part of the Catalyst team at the ALCF. He uses SYCL compiled by the Intel DPC++ compiler to help port CUDA code to run on the Intel GPU. Jiang indicates that the use of oneAPI will also help developers more easily modify code to run on a variety of systems.

Jiang indicates that the ALCF team and Intel have started working with the existing NAMD CUDA GPU code. The team used oneAPI tools to convert the existing GPU code into C++ kernels that can run on Intel Xe-HPC GPUs (codenamed Ponte Vecchio). NAMD’s development efforts will improve cross-platform support by porting CUDA kernels to SYCL with the help of the Intel DPC++ Compatibility Tool. The Intel VTune profiler is used to help improve GPU utilization and the overall performance of the new SYCL kernels.

Jiang indicates that Intel offers workshops and support for the ALCF team. The team currently has access to the Intel oneMKL library, and Intel engineers are helping to debug issues in code designed to run on the future exascale Aurora system.

Jiang says, “oneAPI tools are very convenient because they contain a comprehensive compiler and linker. Additionally, the Intel VTune profiler is included, which helps with performance issues. oneAPI is designed to run on different GPUs, which minimizes the task of writing code for various architectures.”

Future research on complex biological systems

“Current supercomputers can simulate a few hundred microseconds for an average-sized biological system, but they are still limited. Molecular motions take place over a wide range of time scales, ranging from a fraction of a picosecond to a few milliseconds. Much of the biologically relevant dynamics occur in the microsecond to millisecond range. Existing supercomputers cannot literally simulate the operation of such a complex system [as] an ATP-driven Ca2+ pump. The new frontier of future supercomputers requires theoretical advances to build on the information provided by simulations to understand the function of complex biological systems and to determine precisely what is happening in the system. Ultimately, the goal is to quickly find answers related to disease or drug development,” says Roux.
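The gap between these time scales can be put into numbers with a short back-of-the-envelope calculation. The Python sketch below counts how many integration steps a simulation needs to cover a given duration. The 2-femtosecond time step is an assumption chosen as a typical value for all-atom MD; it is not stated in the article, and the function name `steps_required` is hypothetical.

```python
# Femtoseconds per unit of time, kept as integers to avoid rounding error.
FS_PER_UNIT = {
    "fs": 1,
    "ps": 10**3,
    "ns": 10**6,
    "us": 10**9,   # microsecond
    "ms": 10**12,  # millisecond
}


def steps_required(duration, unit, timestep_fs=2):
    """Number of integration steps needed to simulate `duration` (an integer)
    of the given unit, assuming a fixed time step in femtoseconds.

    The default of 2 fs is an assumed, typical all-atom MD value.
    """
    total_fs = duration * FS_PER_UNIT[unit]
    return -(-total_fs // timestep_fs)  # ceiling division on integers


if __name__ == "__main__":
    for duration, unit in [(1, "ns"), (1, "us"), (1, "ms")]:
        print(f"{duration} {unit}: {steps_required(duration, unit):,} steps")
```

Under that assumption, one microsecond takes 500 million steps and one millisecond takes 500 billion, each requiring forces on every atom in the system to be recomputed.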

The ALCF is a DOE Office of Science user facility.

References:

1. Radak BK, Chipot C, Suh D, et al. Constant-pH molecular dynamics simulations for large biomolecular systems. J Chem Theory Comput. 2017;13(12):5933-5944. doi: 10.1021/acs.jctc.7b00875

2. Jiang W, Chipot C, Roux B. Calculation of the relative binding affinity of ligands to the receptor: an efficient hybrid approach to single- and dual-topology free-energy perturbation in NAMD. J Chem Inf Model. 2019;59(9):3794-3802. doi: 10.1021/acs.jcim.9b00362

3. Das A, Rui H, Nakamoto R, Roux B. Conformational transitions and alternating-access mechanism in the sarcoplasmic reticulum calcium pump. J Mol Biol. 2017;429(5):647-666. doi: 10.1016/j.jmb.2017.01.007

4. Thirman J, Rui H, Roux B. Elusive intermediate state key in converting ATP hydrolysis to useful work driving the Ca2+ SERCA pump. J Phys Chem B. 2021;125(11):2921-2928. doi: 10.1021/acs.jpcb.1c00558

5. Verhalen B, Dastvan R, Thangapandian S, et al. Energy transduction and alternating access of the mammalian ABC transporter P-glycoprotein. Nature. 2017;543(7647):738-741. doi: 10.1038/nature21414

6. Kapoor K, Pant S, Tajkhorshid E. Active participation of membrane lipids in inhibition of multidrug transporter P-glycoprotein. Chem Sci. 2021;12(18):6293-6306. doi: 10.1039/D0SC06288J

7. Phillips JC, Hardy DJ, Maia JDC, et al. Scalable molecular dynamics on CPU and GPU architectures with NAMD. J Chem Phys. 2020;153(4):044130. doi: 10.1063/5.0014475

This article was produced as part of Intel’s editorial program, with the goal of highlighting cutting-edge science, research, and innovation conducted by the HPC and AI communities through advanced technology. The content editor has the final editing rights and determines which articles are published.

About Mariel Baker
