Distributed simulation with DLinAlg
Simulation of quantum circuits on an HPC cluster
This notebook introduces how distributed simulations can be run on an HPC cluster with Distributed Qaptiva. In the current version of Distributed Qaptiva, the two available distributed simulators are DLinAlg, a generic simulator based on linear algebra, and DNoisy, which extends it to noisy simulations. These simulators parallelize a simulation by distributing the state vector across several compute nodes, enabling the exact simulation of larger circuits with more qubits.
The DLinAlg simulator represents pure quantum states as state vectors of complex amplitudes, in either double or single precision. This approach demands an amount of memory that grows exponentially with the number of qubits: each complex amplitude requires 16 (8) bytes of storage in double (single) precision, so the total memory required to simulate a quantum circuit with nbqbits qubits is $2^{nbqbits+4}$ ($2^{nbqbits+3}$) bytes.
For example, a 40-qubit double-precision simulation requires $2^{44}$ bytes, i.e. 16 TiB. Assuming that the compute nodes have at least 256 GiB of memory each, we need 64 compute nodes to perform the distributed simulation.
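This sizing rule is easy to check with a few lines of Python. The helper below is only a back-of-the-envelope sketch (it is not part of Distributed Qaptiva); it reproduces the 16 TiB and 64-node figures above:
import math

def state_vector_sizing(nbqbits, mem_per_node_gib=256, single_precision=False):
    # 16 bytes per complex amplitude in double precision, 8 in single precision
    bytes_per_amplitude = 8 if single_precision else 16
    total_bytes = 2 ** nbqbits * bytes_per_amplitude
    # Minimum number of nodes needed to hold the full state vector
    nodes = math.ceil(total_bytes / (mem_per_node_gib * 2 ** 30))
    return total_bytes, nodes

total_bytes, nodes = state_vector_sizing(40)
print(total_bytes / 2 ** 40, "TiB on", nodes, "nodes")  # 16.0 TiB on 64 nodes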
Example simulation
In order to submit a distributed job on a cluster in a SLURM environment, we follow the steps described below:
- Describe the quantum program in a Python script, for example the file qaoa_dlinalg.py written in the following cells
- Write a SLURM batch script, which may contain options prefixed with "#SBATCH", to define the resources to be used on the cluster to run the job
- Submit the SLURM batch script using the sbatch command
- Read the result of the quantum program from the output file generated by the SLURM job
Preparing a Python script describing the quantum program
The Python script that runs a DLinAlg distributed simulation resembles scripts written for other simulators. In the example below, we simulate a QAOA MaxCut circuit distributed over a single MPI process. For simplicity, we run the algorithm on a simple graph with 15 nodes, resulting in a quantum circuit with 15 qubits. However, it is important to note that this simulator is not intended for such small-scale use cases.
Please note that the DLinAlg distributed simulator works with or without SLURM. Under a SLURM environment, it automatically deduces some options from the environment as default values if they are not passed explicitly, for instance the numbers of processes and threads to run the simulation on. Without SLURM, it instead uses the mpirun command, and all the options have to be passed by the user to the constructor of the simulator. For all the available options, please refer to the documentation of the DLinAlg simulator.
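As an illustration, outside of a SLURM environment the resources are passed explicitly when constructing the QPU. The sketch below only uses the two constructor parameters demonstrated later in this notebook, nb_processes and temp_dir; the process count and directory path are placeholders to adapt to your cluster:
from qat.qpus import DLinAlg

# Without SLURM, no defaults are deduced from the environment: the resources
# are given to the constructor, and the simulator launches the MPI processes
# through mpirun. The shared directory below is a placeholder path.
qpu = DLinAlg(nb_processes=4, temp_dir="/scratch/shared_tmp/")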
%%writefile qaoa_dlinalg.py
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
# Specify the graph to run MaxCut algorithm on
graph = nx.full_rary_tree(2, 15)
Writing qaoa_dlinalg.py
The following cell is not required when submitting a real simulation; it only serves to display the graph in this notebook. It is not submitted to the compute nodes and runs on the node where the notebook is executed.
%run qaoa_dlinalg.py
# Draw the graph - may take time for bigger graphs
nodes_positions = nx.spring_layout(graph, iterations=len(graph.nodes()) * 100)
plt.figure(figsize=(14, 9))
nx.draw_networkx(graph,
                 pos=nodes_positions,
                 node_color='#4EEA6A',
                 node_size=440,
                 font_size=14)
plt.show()
%%writefile -a qaoa_dlinalg.py
from qat.generators import MaxCutGenerator
from qat.plugins import ScipyMinimizePlugin
from qat.qpus import DLinAlg
# The nb_processes here is set to 1, so the simulation will only run on a single node
# This argument is optional and the number of reserved nodes in the SLURM environment will be used by default
nb_processes = 1
# The temp_dir argument must be configured to point to a writable temporary directory shared by all the nodes in the cluster
# if the directory containing this Python script does not have write permissions or is not shared
temp_dir = "/tmp/"
qpu = DLinAlg(nb_processes=nb_processes, temp_dir=temp_dir)
max_cut_application = (
    MaxCutGenerator(job_type="qaoa")
    | ScipyMinimizePlugin(method="COBYLA", tol=1e-5, options={"maxiter": 20})
    | qpu
)
combinatorial_result = max_cut_application.execute(graph)
print(combinatorial_result.subsets)
print(combinatorial_result.cost)
Appending to qaoa_dlinalg.py
Writing the SLURM batch script to define the resources needed to run the job
In the SLURM batch script defined in the following cell, the partition of the cluster on which to run the job is not specified, so the default partition designated by the system administrator is used; a partition can be selected through the --partition option. For this example, we only use a single node, and 64 CPUs of that node are used for the simulation. A time limit can be set on the SLURM job through the --time option, and it is recommended to do so. Please refer to the official SLURM documentation for additional options and examples.
%%file submit.sh
#!/bin/sh
#SBATCH --nodes 1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=64
#SBATCH --time 00:03:00
#SBATCH --job-name qaoa_dlinalg
#SBATCH --output slurm_output/qaoa_dlinalg.out
#SBATCH --error slurm_output/qaoa_dlinalg.err
#SBATCH --exclusive
# Load the python3 module to be used on the cluster
# module load python3
echo "Loading the qaptiva-hpc module environment"
module load qaptiva-hpc
# Load other modules if necessary such as openmpi
# module load openmpi
echo "Running the DLinAlg simulation"
python3 qaoa_dlinalg.py
Overwriting submit.sh
Submitting the SLURM job on the cluster
To submit the simulation job to the compute nodes of the cluster, we simply use the sbatch SLURM command, as shown in the following cell. Outside of a SLURM environment, we can instead use the mpirun command to run the Python script, but the resources needed by the simulation job then have to be specified on the command line.
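As a sketch, such a direct launch could look like the command below; the process count of 4 is an arbitrary assumption, and the exact flags depend on the MPI distribution installed on the cluster:
# Hypothetical non-SLURM launch: run the script under mpirun with 4 processes
mpirun -np 4 python3 qaoa_dlinalg.py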
%%bash
sbatch --wait --deadline=now+4minutes submit.sh
Submitted batch job 4
Reading the result from the SLURM output file
When the simulation job is done, we can read the output file of the SLURM job to retrieve the simulation result, or the error file to check whether the job ran successfully.
%%bash
cat slurm_output/qaoa_dlinalg.out
Loading the qaptiva-hpc module environment
Running the DLinAlg simulation
[[1, 2, 7, 8, 9, 10, 11, 12, 13, 14], [0, 3, 4, 5, 6]]
-14.0
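As a quick sanity check (not part of the original workflow), the reported partition can be verified with networkx. This assumes the application's cost is the negative of the number of cut edges, which is consistent with the output above; since the graph is a tree with 14 edges and trees are bipartite, a cut of 14 is indeed optimal:
import networkx as nx

graph = nx.full_rary_tree(2, 15)
subset = {1, 2, 7, 8, 9, 10, 11, 12, 13, 14}
# Count the edges crossing the partition reported in the SLURM output
print(nx.cut_size(graph, subset))  # prints 14, matching the cost of -14.0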