How to run a distributed analog simulation with DMPSTraj
Running a Matrix Product State simulation on an HPC cluster
This notebook introduces how distributed matrix product states can be run on an HPC cluster with Qaptiva HPC. In the current version of Qaptiva HPC, there is only one analog simulator, DMPSTraj. It is a distributed version of Matrix Product States whose computations are embarrassingly parallel, which makes it particularly well suited to distributed computing.
The purpose of this notebook is simply to show how to run notebooks and distributed algorithms on an HPC cluster.
Example simulation
In order to submit a distributed job on a cluster in a SLURM environment, we will follow the steps described below:
- Describe the quantum program in a Python script, for example the file dmpstraj.py written in the following cells
- Write a SLURM batch script, containing options prefixed with "#SBATCH", to define the resources to be used on the cluster to run the job
- Submit the SLURM batch script using the sbatch command
- Read the result of the quantum program from the output file generated by the SLURM job
Toy example for the use of DMPSTraj in a distributed environment
One of the main features of Qaptiva HPC is to give the user an almost seamless experience when working with distributed environments and distributed algorithms. The small toy example below shows how to launch a notebook and execute cells within a distributed environment. Please remember that DMPSTraj only works if SLURM is available on the machine you are using.
In the cell below, the %%writefile dmpstraj.py magic command indicates that the code within the cell should be written to a Python script named dmpstraj.py.
%%writefile dmpstraj.py
import numpy as np
from qat.qpus import DMPSTraj
from qat.core import Observable, Term, Schedule
from qat.core.variables import Variable
from qat.hardware import HardwareModel
# Define a time Variable
t = Variable("t", float)
# Define the Hamiltonian in a drive to enter the Schedule
alpha_t = t * np.pi # when this is integrated w.r.t time and t=1, we'll get a rotation of np.pi
drive = alpha_t * Observable(1, pauli_terms=[Term(1, "Y", [0])])
schedule = Schedule(drive=drive, tmax=1.0)
# Create a Job - in SAMPLE mode by default
job_sample = schedule.to_job()
# Define a noisy QPU
qpu = DMPSTraj(hardware_model=HardwareModel(jump_operators=[Observable(1, pauli_terms=[(Term(0.6, 'Z', [0]))])]),
n_samples=256, # can specify the number of time steps for the integration
sim_method="number_continuous")
Writing dmpstraj.py
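As a quick sanity check on the drive defined above (independent of Qaptiva, using only the standard library), we can integrate alpha(t) = pi * t over [0, 1] numerically. The integral is pi/2, which corresponds to a pi rotation under the usual U = exp(-i * theta/2 * Y) convention; this is a sketch assuming that convention applies to the Schedule here.

```python
import math

# alpha(t) = pi * t, as defined in dmpstraj.py
def alpha(t):
    return math.pi * t

def integrate(f, a, b, steps=1000):
    # Simple trapezoidal rule over [a, b]
    dt = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    total += sum(f(a + i * dt) for i in range(1, steps))
    return total * dt

area = integrate(alpha, 0.0, 1.0)   # = pi / 2
# Under U = exp(-i * theta/2 * Y), the rotation angle is twice the integral
theta = 2 * area                    # = pi
```

The trapezoidal rule is exact for a linear integrand, so the result matches pi up to floating-point error.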
The following cell also uses the %%writefile magic, but this time with the -a flag, for "append": instead of overwriting, it appends the code in the cell to the already existing Python script, in this case dmpstraj.py.
Important remark: DMPSTraj has a crucial attribute, n_nodes, which must be defined every time the class is used, as it tells the simulator across how many nodes to distribute the computations. It is not set explicitly here because it will be provided through the sbatch file below.
%%writefile -a dmpstraj.py
# Send the job for simulation and output the results
res = qpu.submit(job_sample)
for sample in res:
    print(sample.state, sample.probability)
print(f"Done ! Run time: {res.meta_data['simulation_time']}")
Appending to dmpstraj.py
Once the script is complete, it needs to run in the SLURM environment. For this, a SLURM batch (sbatch) file must be created. Note the #SBATCH directives, which define the cluster's resources and environment (number of nodes, output and error files, ...). These directives are mandatory when running a script in a distributed environment, as they allocate the resources for the computations.
Important remark: as mentioned above, the n_nodes argument is set implicitly when the number of nodes is defined with the #SBATCH --nodes directive. Once the number of nodes is defined using SBATCH, it becomes an environment variable that is fetched by DMPSTraj.
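Conceptually, that lookup can be emulated with the standard library: inside a batch allocation, SLURM exports the node count as the SLURM_NNODES environment variable. The helper name get_n_nodes below is illustrative, not part of the Qaptiva API.

```python
import os

def get_n_nodes(default=1):
    # Inside an sbatch allocation, SLURM exports SLURM_NNODES
    # (the value requested with #SBATCH --nodes); fall back to a
    # default when running outside SLURM.
    return int(os.environ.get("SLURM_NNODES", default))

n_nodes = get_n_nodes()
```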
%%file submit.sh
#!/bin/sh
#SBATCH --nodes 1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=64
#SBATCH --time 00:02:00
#SBATCH --job-name dmpstraj_example
#SBATCH --output slurm_output/dmpstraj.out
#SBATCH --error slurm_output/dmpstraj.err
#SBATCH --exclusive
# Load the python3 module to be used on the cluster
# module load python3
echo "Loading the qaptiva-hpc module environment"
module load qaptiva-hpc
# Load other modules if necessary such as openmpi
# module load openmpi
echo "Running the DMPSTraj simulation"
python3 dmpstraj.py
Overwriting submit.sh
After writing the submit.sh file, it can be executed via the command line from your notebook. The corresponding cell magic command is %%bash, which indicates that the cell contains bash commands. The cell below runs sbatch on the submit.sh file with two additional options: --wait, which blocks until the submitted job terminates, and --deadline=now+03minutes, which asks SLURM to cancel the job if it cannot complete within three minutes. Together they give the script time to run and produce its outputs before the next cell executes.
%%bash
sbatch --wait --deadline=now+03minutes submit.sh
Submitted batch job 2
The cell below is also a bash cell: it prints the output from dmpstraj.out, located in the slurm_output directory, which is what the cat command does.
%%bash
cat slurm_output/dmpstraj.out
Loading the qaptiva-hpc module environment
Running the DMPSTraj simulation
|0> 0.10600249843014219
|1> 0.8939975015698571
Done ! Run time: 3.1937806606292725
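As a last consistency check on the printed results, the two reported probabilities should form a valid probability distribution, i.e. sum to one up to floating-point error:

```python
# Probabilities of |0> and |1> reported in the output above
p0 = 0.10600249843014219
p1 = 0.8939975015698571
total = p0 + p1
assert abs(total - 1.0) < 1e-12  # a valid probability distribution
```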