pygsti.tools.mpitools
Functions for working with MPI processor distributions
Module Contents
Functions

distribute_indices - Partition an array of indices (any type) evenly among comm's processors.

distribute_indices_base - Partition an array of "indices" evenly among a given number of "processors".

slice_up_slice - Divides up slc into num_slices slices.

slice_up_range - Divides up range(start, start+n) into num_slices slices.

distribute_slice - Partition a continuous slice evenly among comm's processors.

gather_slices - Gathers data within a numpy array, ar_to_fill, according to given slices.

gather_slices_by_owner - Gathers data within a numpy array, ar_to_fill, according to given slices.

gather_indices - Gathers data within a numpy array, ar_to_fill, according to given indices.

distribute_for_dot - Prepares for one or multiple distributed dot products given the dimensions to be dotted.

mpidot - Performs a distributed dot product, dot(a, b).

parallel_apply - Apply a function, f, to every element of a list, l, in parallel, using MPI.

mpi4py_comm - Get a comm object.

sum_across_procs - Sum a value across all processors in comm.

processor_group_size - Find the number of groups to divide nprocs processors into to tackle number_of_tasks tasks.

sum_arrays - Sums arrays across all "owner" processors.

Returns the divisor of a that is closest to b.
 pygsti.tools.mpitools.distribute_indices(indices, comm, allow_split_comm=True)
Partition an array of indices (any type) evenly among comm’s processors.
Parameters
indices : list
An array of items (any type) which are to be partitioned.
comm : mpi4py.MPI.Comm or ResourceAllocation
The communicator which specifies the number of processors and which may be split into returned subcommunicators. If a ResourceAllocation object, node information is also taken into account when available (for shared memory compatibility).
allow_split_comm : bool
If True, when there are more processors than indices, multiple processors will be given the same set of local indices and comm will be split into subcommunicators, one for each group of processors that are given the same indices. If False, then "extra" processors are simply given nothing to do, i.e. empty lists of local indices.
Returns
loc_indices : list
A list containing the elements of indices belonging to the current processor.
owners : dict
A dictionary mapping the elements of indices to integer ranks, such that owners[el] gives the rank of the processor responsible for communicating that element's results to the other processors. Note that when allow_split_comm=True and multiple processors have computed the results for a given element, only a single (the first) processor rank "owns" the element, and is thus responsible for sharing the results. This notion of ownership is useful when gathering the results.
loc_comm : mpi4py.MPI.Comm or ResourceAllocation or None
The local communicator for the group of processors which have been given the same loc_indices to compute, obtained by splitting comm. If loc_indices is unique to the current processor, or if allow_split_comm is False, None is returned.
 pygsti.tools.mpitools.distribute_indices_base(indices, nprocs, rank, allow_split_comm=True)
Partition an array of “indices” evenly among a given number of “processors”
This function is similar to distribute_indices(), but allows for a more generalized notion of what a "processor" is, since the number of processors and rank are given independently and do not have to be associated with an MPI comm. Note also that indices can be an arbitrary list of items, making this function very general.
Parameters
indices : list
An array of items (any type) which are to be partitioned.
nprocs : int
The number of "processors" to distribute the elements of indices among.
rank : int
The rank of the current "processor" (must be an integer between 0 and nprocs - 1). Note that this value is not obtained from any MPI communicator.
allow_split_comm : bool
If True, when there are more processors than indices, multiple processors will be given the same set of local indices. If False, then extra processors are simply given nothing to do, i.e. empty lists of local indices.
Returns
loc_indices : list
A list containing the elements of indices belonging to the current processor (i.e. the one specified by rank).
owners : dict
A dictionary mapping the elements of indices to integer ranks, such that owners[el] gives the rank of the processor responsible for communicating that element's results to the other processors. Note that when allow_split_comm=True and multiple processors have computed the results for a given element, only a single (the first) processor rank "owns" the element, and is thus responsible for sharing the results. This notion of ownership is useful when gathering the results.
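The balanced, contiguous partition described above can be sketched in plain Python. This is an illustrative stand-in, not pyGSTi's implementation (which also handles comm splitting and the allow_split_comm cases); the name distribute_indices_sketch is hypothetical.

```python
def distribute_indices_sketch(indices, nprocs, rank):
    # Hypothetical sketch of a balanced partition: each rank gets a
    # contiguous share, and the first (len(indices) % nprocs) ranks
    # receive one extra element.
    base, extra = divmod(len(indices), nprocs)

    def bounds(r):
        start = r * base + min(r, extra)
        return start, start + base + (1 if r < extra else 0)

    start, stop = bounds(rank)
    loc_indices = indices[start:stop]

    # Each element is "owned" by the rank that holds it locally.
    owners = {}
    for r in range(nprocs):
        s, e = bounds(r)
        for el in indices[s:e]:
            owners[el] = r
    return loc_indices, owners
```

When there are more ranks than items, base is 0 and the trailing ranks receive empty lists, mirroring the allow_split_comm=False behavior.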
 pygsti.tools.mpitools.slice_up_slice(slc, num_slices)
Divides up slc into num_slices slices.
Parameters
slc : slice
The slice to be divided.
num_slices : int
The number of slices to divide the range into.
Returns
list of slices
 pygsti.tools.mpitools.slice_up_range(n, num_slices, start=0)
Divides up range(start,start+n) into num_slices slices.
Parameters
n : int
The number of (consecutive) indices in the range to be divided.
num_slices : int
The number of slices to divide the range into.
start : int, optional
The starting entry of the range, so that the range to be divided is range(start, start+n).
Returns
list of slices
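The range division can be sketched as follows. This is an illustrative stand-in (the real slice_up_range may distribute the remainder differently); the name slice_up_range_sketch is hypothetical.

```python
def slice_up_range_sketch(n, num_slices, start=0):
    # Divide range(start, start + n) into num_slices contiguous slices;
    # the first n % num_slices slices get one extra element.
    base, extra = divmod(n, num_slices)
    slices, off = [], start
    for i in range(num_slices):
        length = base + (1 if i < extra else 0)
        slices.append(slice(off, off + length))
        off += length
    return slices
```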
 pygsti.tools.mpitools.distribute_slice(s, comm, allow_split_comm=True)
Partition a continuous slice evenly among comm’s processors.
This function is similar to distribute_indices(), but is specific to the case when the indices being distributed are a consecutive set of integers (specified by a slice).
Parameters
s : slice
The slice to be partitioned.
comm : mpi4py.MPI.Comm or ResourceAllocation
The communicator which specifies the number of processors and which may be split into returned subcommunicators. If a ResourceAllocation object, node information is also taken into account when available (for shared memory compatibility).
allow_split_comm : bool
If True, when there are more processors than slice indices, multiple processors will be given the same local slice and comm will be split into subcommunicators, one for each group of processors that are given the same local slice. If False, then "extra" processors are simply given nothing to do, i.e. an empty local slice.
Returns
slices : list of slices
The list of unique slices assigned to different processors. It's possible that a single slice (i.e. element of slices) is assigned to multiple processors (when there are more processors than indices in s).
loc_slice : slice
A slice specifying the indices belonging to the current processor.
owners : dict
A dictionary giving the owning rank of each slice. Values are integer ranks and keys are integers into slices, specifying which slice.
loc_comm : mpi4py.MPI.Comm or ResourceAllocation or None
The local communicator/ResourceAllocation for the group of processors which have been given the same loc_slice to compute, obtained by splitting comm. If loc_slice is unique to the current processor, or if allow_split_comm is False, None is returned.
 pygsti.tools.mpitools.gather_slices(slices, slice_owners, ar_to_fill, ar_to_fill_inds, axes, comm, max_buffer_size=None)
Gathers data within a numpy array, ar_to_fill, according to given slices.
Upon entry it is assumed that the different processors within comm have computed different parts of ar_to_fill, namely different slices along the axis (or axes) given by axes. At exit, data has been gathered such that all processors have the results for the entire ar_to_fill (or at least for all the slices given).
Parameters
slices : list
A list of all the slices (computed by any of the processors, not just the current one). Each element of slices may be either a single slice or a tuple of slices (when gathering across multiple dimensions).
slice_owners : dict
A dictionary mapping the index of a slice (or tuple of slices) within slices to an integer rank of the processor responsible for communicating that slice's data to the rest of the processors.
ar_to_fill : numpy.ndarray
The array which contains partial data upon entry and the gathered data upon exit.
ar_to_fill_inds : list
A list of slices or index arrays specifying the (fixed) subarray of ar_to_fill that should be gathered into. The elements of ar_to_fill_inds are taken to be indices for the leading dimension first, and any unspecified dimensions or None elements are assumed to be unrestricted (as if slice(None, None)). Note that the combination of ar_to_fill and ar_to_fill_inds is essentially like passing ar_to_fill[ar_to_fill_inds] to this function, except it will work with index arrays as well as slices.
axes : int or tuple of ints
The axis or axes of ar_to_fill to which the slices apply (which axis do the slices in slices refer to?). Note that len(axes) must be equal to the number of slices (i.e. the tuple length) of each element of slices.
comm : mpi4py.MPI.Comm or ResourceAllocation or None
The communicator specifying the processors involved and used to perform the gather operation. If a ResourceAllocation is provided, then interhost communication is used when available to facilitate use of shared intrahost memory.
max_buffer_size : int or None
The maximum buffer size in bytes that is allowed to be used for gathering data. If None, there is no limit.
Returns
None
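The gather pattern can be simulated serially with Python lists, each standing in for one rank's copy of ar_to_fill. The function gather_slices_sketch below is a hypothetical stand-in for the MPI broadcasts that gather_slices performs, kept to one dimension for clarity.

```python
def gather_slices_sketch(slices, slice_owners, arrays_by_rank):
    # Serial stand-in for the MPI gather: every "rank" holds a full-size
    # copy of the array but has only filled its assigned slice; each
    # slice's owner "broadcasts" its data into every copy.
    for i, slc in enumerate(slices):
        data = arrays_by_rank[slice_owners[i]][slc]
        for ar in arrays_by_rank:
            ar[slc] = data
    return arrays_by_rank

# Two ranks, length-6 array: rank 0 computed [0:3], rank 1 computed [3:6].
rank0 = [1, 2, 3, 0, 0, 0]
rank1 = [0, 0, 0, 4, 5, 6]
gather_slices_sketch([slice(0, 3), slice(3, 6)], {0: 0, 1: 1}, [rank0, rank1])
# Afterwards both copies hold the complete data [1, 2, 3, 4, 5, 6].
```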
 pygsti.tools.mpitools.gather_slices_by_owner(current_slices, ar_to_fill, ar_to_fill_inds, axes, comm, max_buffer_size=None)
Gathers data within a numpy array, ar_to_fill, according to given slices.
Upon entry it is assumed that the different processors within comm have computed different parts of ar_to_fill, namely different slices of the axes indexed by axes. At exit, data has been gathered such that all processors have the results for the entire ar_to_fill (or at least for all the slices given).
Parameters
current_slices : list
A list of all the slices computed by the current processor. Each element of current_slices may be either a single slice or a tuple of slices (when gathering across multiple dimensions).
ar_to_fill : numpy.ndarray
The array which contains partial data upon entry and the gathered data upon exit.
ar_to_fill_inds : list
A list of slices or index arrays specifying the (fixed) subarray of ar_to_fill that should be gathered into. The elements of ar_to_fill_inds are taken to be indices for the leading dimension first, and any unspecified dimensions or None elements are assumed to be unrestricted (as if slice(None, None)). Note that the combination of ar_to_fill and ar_to_fill_inds is essentially like passing ar_to_fill[ar_to_fill_inds] to this function, except it will work with index arrays as well as slices.
axes : int or tuple of ints
The axis or axes of ar_to_fill to which the slices apply (which axis do the slices in current_slices refer to?). Note that len(axes) must be equal to the number of slices (i.e. the tuple length) of each element of current_slices.
comm : mpi4py.MPI.Comm or None
The communicator specifying the processors involved and used to perform the gather operation.
max_buffer_size : int or None
The maximum buffer size in bytes that is allowed to be used for gathering data. If None, there is no limit.
Returns
None
 pygsti.tools.mpitools.gather_indices(indices, index_owners, ar_to_fill, ar_to_fill_inds, axes, comm, max_buffer_size=None)
Gathers data within a numpy array, ar_to_fill, according to given indices.
Upon entry it is assumed that the different processors within comm have computed different parts of ar_to_fill, namely different slices or index arrays along the axis (or axes) given by axes. At exit, data has been gathered such that all processors have the results for the entire ar_to_fill (or at least for all the indices given).
Parameters
indices : list
A list of all the integer arrays or slices (computed by any of the processors, not just the current one). Each element of indices may be either a single slice/index array or a tuple of such elements (when gathering across multiple dimensions).
index_owners : dict
A dictionary mapping the index of an element within indices to an integer rank of the processor responsible for communicating that slice/index array's data to the rest of the processors.
ar_to_fill : numpy.ndarray
The array which contains partial data upon entry and the gathered data upon exit.
ar_to_fill_inds : list
A list of slices or index arrays specifying the (fixed) subarray of ar_to_fill that should be gathered into. The elements of ar_to_fill_inds are taken to be indices for the leading dimension first, and any unspecified dimensions or None elements are assumed to be unrestricted (as if slice(None, None)). Note that the combination of ar_to_fill and ar_to_fill_inds is essentially like passing ar_to_fill[ar_to_fill_inds] to this function, except it will work with index arrays as well as slices.
axes : int or tuple of ints
The axis or axes of ar_to_fill to which the indices apply (which axis do the elements of indices refer to?). Note that len(axes) must be equal to the number of sub-indices (i.e. the tuple length) of each element of indices.
comm : mpi4py.MPI.Comm or None
The communicator specifying the processors involved and used to perform the gather operation.
max_buffer_size : int or None
The maximum buffer size in bytes that is allowed to be used for gathering data. If None, there is no limit.
Returns
None
 pygsti.tools.mpitools.distribute_for_dot(a_shape, b_shape, comm)
Prepares for one or multiple distributed dot products given the dimensions to be dotted.
The returned values should be passed as loc_slices to mpidot().
Parameters
a_shape, b_shape : tuple
The shapes of the arrays that will be dotted together in ensuing mpidot() calls (see above).
comm : mpi4py.MPI.Comm or ResourceAllocation or None
The communicator used to perform the distribution.
Returns
row_slice, col_slice : slice
The "local" row slice of "A" and column slice of "B" belonging to the current processor, which computes result[row slice, col slice]. These should be passed to mpidot().
slice_tuples_by_rank : list
A list of the (row_slice, col_slice) owned by each processor, ordered by rank. If a ResourceAllocation is given that utilizes shared memory, then this list is for the ranks in this processor's interhost communication group. This should be passed as the slice_tuples_by_rank argument of mpidot().
 pygsti.tools.mpitools.mpidot(a, b, loc_row_slice, loc_col_slice, slice_tuples_by_rank, comm, out=None, out_shm=None)
Performs a distributed dot product, dot(a,b).
Parameters
a : numpy.ndarray
First array to dot together.
b : numpy.ndarray
Second array to dot together.
loc_row_slice, loc_col_slice : slice
Specify the row and column indices, respectively, of the resulting dot product that are computed by this processor (the rows of a and columns of b that are used). Obtained from distribute_for_dot().
slice_tuples_by_rank : list
A list of (row_slice, col_slice) tuples, one per processor within this processor's broadcast group, ordered by rank. Provided by distribute_for_dot().
comm : mpi4py.MPI.Comm or ResourceAllocation or None
The communicator used to parallelize the dot product. If a ResourceAllocation object is given, then a shared memory result will be returned when appropriate.
out : numpy.ndarray, optional
If not None, the array to use for the result. This should be the same type of array (size, and whether it's shared or not) as this function would have created if out were None.
out_shm : multiprocessing.shared_memory.SharedMemory, optional
The shared memory object corresponding to out when it uses shared memory.
Returns
result : numpy.ndarray
The resulting array.
shm : multiprocessing.shared_memory.SharedMemory
A shared memory object needed to clean up the shared memory. If a normal array is created, this is None. Provide this to cleanup_shared_ndarray() to ensure result is deallocated properly.
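The division of labor behind distribute_for_dot() and mpidot() can be simulated serially: each (row_slice, col_slice) pair is one rank's share of the result. The sketch below uses plain Python lists instead of numpy so it is self-contained; mpidot_sketch and matmul are hypothetical stand-ins, not pyGSTi's implementation.

```python
def matmul(a, b):
    # Plain Python matrix product (stand-in for numpy.dot).
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mpidot_sketch(a, b, slice_tuples_by_rank):
    # Serial simulation: each (row_slice, col_slice) pair is one rank's
    # share of the result, computed from the matching rows of a and
    # columns of b; the pieces are then assembled (in real mpidot this
    # assembly is an MPI gather).
    nrows, ncols = len(a), len(b[0])
    result = [[0] * ncols for _ in range(nrows)]
    for row_slc, col_slc in slice_tuples_by_rank:
        block = matmul(a[row_slc], [row[col_slc] for row in b])
        for i, ri in enumerate(range(*row_slc.indices(nrows))):
            for j, cj in enumerate(range(*col_slc.indices(ncols))):
                result[ri][cj] = block[i][j]
    return result
```

Here "rank 0" computes the first result row and "rank 1" the second; assembling the blocks reproduces the full product.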
 pygsti.tools.mpitools.parallel_apply(f, l, comm)
Apply a function, f, to every element of a list, l, in parallel, using MPI.
Parameters
f : function
Function of an item in the list l.
l : list
List of items serving as arguments to f.
comm : MPI Comm
MPI communicator object for organizing parallel programs.
Returns
results : list
List of items after f has been applied.
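The distribute-apply-gather pattern parallel_apply uses can be simulated serially. parallel_apply_sketch is a hypothetical stand-in; the contiguous partition mirrors distribute_indices_base.

```python
def parallel_apply_sketch(f, l, nprocs):
    # Serial stand-in: partition the list's indices contiguously among
    # nprocs "ranks", apply f to each rank's share, then reassemble the
    # results in order (in real parallel_apply this is an MPI gather).
    results = [None] * len(l)
    base, extra = divmod(len(l), nprocs)
    for rank in range(nprocs):
        start = rank * base + min(rank, extra)
        stop = start + base + (1 if rank < extra else 0)
        for i in range(start, stop):
            results[i] = f(l[i])
    return results
```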
 pygsti.tools.mpitools.mpi4py_comm()
Get a comm object
Returns
 MPI.Comm
Comm object to be passed down to parallel pygsti routines
 pygsti.tools.mpitools.sum_across_procs(x, comm)
Sum a value across all processors in comm.
Parameters
x : object
Local value - the current processor's contribution to the sum.
comm : mpi4py.MPI.Comm
MPI communicator
Returns
 object
Of the same type as the x objects that were summed.
 pygsti.tools.mpitools.processor_group_size(nprocs, number_of_tasks)
Find the number of groups to divide nprocs processors into to tackle number_of_tasks tasks.
When number_of_tasks > nprocs, the smallest integer multiple of nprocs that equals or exceeds number_of_tasks is returned.
When number_of_tasks < nprocs, the smallest divisor of nprocs that equals or exceeds number_of_tasks is returned.
Parameters
nprocs : int
The number of processors to divide into groups.
number_of_tasks : int or float
The number of tasks to perform, which can also be seen as the desired number of processor groups. If a floating point value is given, the next highest integer is used.
Returns
int
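The two rules above can be sketched directly. processor_group_size_sketch is a hypothetical illustration of the described behavior, not pyGSTi's code.

```python
import math

def processor_group_size_sketch(nprocs, number_of_tasks):
    # Round a fractional task count up to the next integer.
    ntasks = int(math.ceil(number_of_tasks))
    if ntasks >= nprocs:
        # Smallest integer multiple of nprocs that is >= ntasks.
        return nprocs * ((ntasks + nprocs - 1) // nprocs)
    # Smallest divisor of nprocs that is >= ntasks.
    return min(d for d in range(ntasks, nprocs + 1) if nprocs % d == 0)
```

For example, 4 processors and 10 tasks give 12 (the next multiple of 4), while 12 processors and 5 tasks give 6 (the smallest divisor of 12 that is at least 5).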
 pygsti.tools.mpitools.sum_arrays(local_array, owners, comm)
Sums arrays across all “owner” processors.
Parameters
local_array : numpy.ndarray
The array contributed by this processor. This array will be zeroed out on processors whose ranks are not in owners.
owners : list or set
The ranks whose contributions should be summed. These are the ranks of the processors that "own" the responsibility to communicate their local array to the rest of the processors.
comm : mpi4py.MPI.Comm
MPI communicator
Returns
 numpy.ndarray
The summed local arrays.