Computational Chemistry using Cloud HPC

How-To

April 22, 2024

High-Performance Computing (HPC) is essential for tasks like battery simulation or drug development modeling in fields like materials science and biopharmaceutical research. These fields rely on HPC for its precision and power. However, moving these tasks to the cloud, while beneficial for scalability, introduces challenges in resource management, data handling, and performance consistency. Covalent Cloud offers a fully Pythonic and backend-agnostic approach to creating and managing high-compute workflows, allowing researchers to concentrate on complex scientific problems without getting entangled in the technicalities of cloud computing and script management.

Let’s dive into a computational chemistry example for molecular modeling, specifically calculating the nitrogen bond energy, using Covalent Cloud. Calculating the single-point energy is a common exercise in materials science, and the same approach scales well to more complex systems.

Adapting High Performance Computing Workflows to Cloud

In traditional cloud settings, the initial step often involves creating a Docker image to set up the required environment. This can be a time-consuming and intricate process. Covalent Cloud, however, simplifies this setup significantly.

Users can create an environment directly from their Python notebook. For example, to install ase from pip and openmpi, openssh, and gpaw from the conda-forge channel, all you need to do is:

import covalent_cloud as cc

cc.create_env(name="dft",
              pip=["ase"],
              conda={"channels": ["conda-forge"],
                     "dependencies": ["openmpi", "openssh", "gpaw"]})

Next, we define a Python function to calculate energy.

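from gpaw import GPAW
import os

def get_energy(system, calc_kwargs=None):
    # Fall back to the PBE functional without using a mutable default argument
    if calc_kwargs is None:
        calc_kwargs = {'xc': 'PBE'}
    # Helper not shown in this post; it is expected to fetch GPAW's PAW datasets
    download_and_install_paw_datasets()
    calc = GPAW(**calc_kwargs)
    # Attach the calculator (set_calculator is deprecated in recent ASE releases)
    system.calc = calc
    return system.get_potential_energy()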

To run this function on the cloud HPC, Covalent Cloud provides intuitive primitives to transform the function into an efficient cloud-ready workflow. This is achieved through decorators that wrap individual tasks and combine them into a coherent workflow:

import covalent as ct  # the open-source Covalent SDK

# low_compute is the cc.CloudExecutor defined in the next snippet
@ct.lattice(executor=low_compute, workflow_executor=low_compute)
def calculate_energy(systems, executor, calculator_kwargs={}):
    result = []
    # Convert get_energy into a Covalent task bound to the chosen executor
    get_energy_electron = ct.electron(get_energy, executor=executor)
    # A plain Python loop is enough: each call becomes an independent task
    for system in systems:
        result.append(get_energy_electron(system, calculator_kwargs))
    return result

The decorators ensure that each task, now called an ‘electron’, is adapted for cloud execution and managed by the specified ‘executor’ resource. The beauty of Covalent Cloud lies in its ability to abstract the complexity of cloud resource definition and management, enabling researchers to focus on their core computational tasks. For executing the workflow, Covalent Cloud allows users to define high-performance task-dependent compute resources with ease:

low_compute = cc.CloudExecutor(env="dft", num_cpus=1, memory="1GB", time_limit="1 hour")
high_compute = cc.CloudExecutor(env="dft", num_cpus=32, memory="4GB", time_limit="3 hours")
# or, as an alternative definition, with GPUs
high_compute = cc.CloudExecutor(env="dft", num_cpus=32, memory="4GB", gpu_type="h100", num_gpus=8)

This flexibility in specifying resources caters to the varying demands of different research tasks, which would otherwise involve allocating and managing separate compute queues or bare-metal machines on the cloud.

When it’s time to run the calculations, invoking the workflow is as straightforward as:

calc_id = cc.dispatch(calculate_energy)(systems=molecules,
                                        executor=high_compute,
                                        calculator_kwargs=calculation_parameters)
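
The dispatch call above assumes that `molecules` and `calculation_parameters` already exist; the post does not show how they are built. As a minimal sketch, assuming a scan over 30 evenly spaced N-N bond lengths between 0.9 Å and 1.5 Å (illustrative values, not taken from the original workflow), the inputs might be prepared as follows. The helper `n2_geometries` is hypothetical, and the geometries are kept as plain Python data so that the snippet runs without ASE installed.

```python
def n2_geometries(n_points=30, d_min=0.9, d_max=1.5):
    """Return n_points N2 geometries with evenly spaced bond lengths (angstrom).

    Hypothetical helper for illustration; the range and count are assumptions.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    step = (d_max - d_min) / (n_points - 1)
    geometries = []
    for i in range(n_points):
        d = d_min + i * step
        # One atom at the origin, the other displaced along z by the bond length
        geometries.append({"symbols": "N2",
                           "positions": [(0.0, 0.0, 0.0), (0.0, 0.0, d)]})
    return geometries


geometries = n2_geometries()
calculation_parameters = {"xc": "PBE"}  # same functional as get_energy's default

# With ASE installed, each entry could be turned into an Atoms object, e.g.:
#   from ase import Atoms
#   molecules = [Atoms(g["symbols"], positions=g["positions"]) for g in geometries]
# GPAW also needs a simulation cell, which Atoms.center(vacuum=<padding>) can set up.
print(len(geometries), "geometries prepared")
```

Keeping one geometry per list entry matches how the lattice loops over `systems`: each entry becomes its own task.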

In the background, Covalent Cloud manages resource allocation (across its vast array of clouds), container and data management, and parallel execution while ensuring real-time status updates and logs are accessible via a user interface. This is particularly vital for computations that run for extended periods, where monitoring and managing can be complex. In addition, Covalent Cloud automatically parallelizes independent calculations in the workflow, significantly enhancing efficiency by running them concurrently. For long-duration tasks, researchers have the convenience of asynchronously accessing results from any computer, anytime, using just the dispatch ID.

result = cc.get_result(calc_id, wait=True)
# Use fixed-point formatting (.3f) so the value prints with three decimal places
print(f"Total time taken: {(result.end_time - result.start_time).total_seconds() / 60:.3f} minutes")
# Total time taken: 1.202 minutes
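
The automatic parallelization of independent calculations mentioned above can be pictured with a small local analogue. The sketch below is not how Covalent Cloud is implemented; it only illustrates, with Python's standard `concurrent.futures` module, why tasks that do not depend on one another finish sooner when run concurrently. The function `placeholder_energy` is a made-up stand-in for an expensive DFT call and returns arbitrary numbers, not physical energies.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def placeholder_energy(bond_length):
    """Stand-in for an expensive calculation; NOT a real energy model."""
    time.sleep(0.05)                  # pretend to do some work
    return (bond_length - 1.1) ** 2   # arbitrary toy value


bond_lengths = [0.9 + 0.1 * i for i in range(6)]

# Serial baseline: one task after another
start = time.perf_counter()
serial = [placeholder_energy(d) for d in bond_lengths]
serial_time = time.perf_counter() - start

# Concurrent run: independent tasks overlap; pool.map preserves input order
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(bond_lengths)) as pool:
    concurrent = list(pool.map(placeholder_energy, bond_lengths))
concurrent_time = time.perf_counter() - start

print("same results:", serial == concurrent)
print(f"serial: {serial_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

In the cloud workflow, each electron call plays the role of one submitted task, with the difference that every task gets its own remote compute resources rather than a local thread.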

Results

In this case, 30 DFT calculations of the nitrogen single-point energy completed in just over a minute. This showcases Covalent Cloud’s ability to scale rapidly to nearly 960 compute cores, providing quick access to high-level computing power while paying only for the serverless compute actually used (approximately $2.88 for this run).
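
The figures quoted above can be cross-checked with some quick arithmetic. The sketch below assumes each of the 30 calculations ran on the 32-CPU `high_compute` executor, which is where the 960-core figure would come from. The implied per-core rate is a back-of-the-envelope estimate that treats every core as busy for the whole wall time; it is not a published price.

```python
n_calculations = 30     # number of DFT calculations, from the post
cpus_per_task = 32      # num_cpus of the high_compute executor
wall_minutes = 1.202    # total time printed by the workflow
total_cost_usd = 2.88   # approximate cost quoted in the post

# Peak core count if all calculations run at the same time
peak_cores = n_calculations * cpus_per_task

# Upper bound on usage: assumes all cores are busy for the full wall time
core_minutes = peak_cores * wall_minutes
implied_rate = total_cost_usd / core_minutes

print(peak_cores)  # 960
print(f"{core_minutes:.1f} core-minutes (upper bound)")
print(f"${implied_rate:.4f} per core-minute (implied)")
```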

This example only scratches the surface of what’s possible with Covalent Cloud. For instance, Covalent Cloud introduces primitives like re-dispatch, allowing users to effortlessly relaunch workflows with their own parameters and resources using just the dispatch ID. This eliminates the need for setting up environments or other preliminary tasks, significantly streamlining the process. Moreover, these workflows can be made available as APIs that are globally accessible and scalable without worrying about infrastructure complexities. With Covalent Cloud, you can also experience the benefits of real-time sharing of workflows. Imagine a scenario where a research leader can monitor the progress of experiments live, facilitating a more integrated and efficient research environment.

In conclusion, while cloud-based High-Performance Computing offers immense benefits, its complexity can often be a barrier. Covalent Cloud addresses this by simplifying the intricacies of cloud HPC. It enables researchers to focus more on their scientific inquiries and less on the underlying computational challenges. With its new paradigm for interacting with cloud HPC, Covalent Cloud provides useful abstractions while still handing comprehensive control to users. This is reshaping the landscape of cloud-based high performance computing, making it more accessible and versatile for the ever-evolving demands of scientific exploration and discovery.

To learn how to run materials science calculations on the cloud, see our full tutorial.
