pygsti.optimize

pyGSTi Optimization Python Package

Submodules

Package Contents

Classes

ArraysInterface

An interface between pyGSTi's optimization methods and data storage arrays.

UndistributedArraysInterface

An arrays interface for the case when the arrays are not actually distributed.

DistributedArraysInterface

An arrays interface where the arrays are distributed according to a distributed layout.

OptimizerResult

The result from an optimization.

Optimizer

An optimizer. Optimizes an objective function.

CustomLMOptimizer

A Levenberg-Marquardt optimizer customized for GST-like problems.

Functions

custom_leastsq(obj_fn, jac_fn, x0[, f_norm2_tol, ...])

An implementation of the Levenberg-Marquardt least-squares optimization algorithm customized for use within pyGSTi.

custom_solve(a, b, x, ari, resource_alloc[, ...])

Simple parallel Gaussian Elimination with pivoting.

fmax_cg(f, x0[, maxiters, tol, dfdx_and_bdflag, xopt])

Custom conjugate-gradient (CG) routine for maximizing a function.

minimize(fn, x0[, method, callback, tol, maxiter, ...])

Minimizes the function fn starting at x0.

create_objfn_printer(obj_func[, start_time])

Create a callback function that prints the value of an objective function.

check_jac(f, x0, jac_to_check[, eps, tol, err_type, ...])

Checks a jacobian function using finite differences.

optimize_wildcard_budget_neldermead(budget, L1weights, ...)

Uses repeated Nelder-Mead to optimize the wildcard budget.

optimize_wildcard_budget_percircuit_only_cvxpy(budget, ...)

Uses CVXPY to optimize the wildcard budget. Includes only per-circuit constraints.

optimize_wildcard_bisect_alpha(budget, objfn, ...[, ...])

optimize_wildcard_budget_cvxopt(budget, L1weights, ...)

Uses CVXOPT to optimize the wildcard budget. Includes both aggregate and per-circuit constraints.

optimize_wildcard_budget_cvxopt_zeroreg(budget, ...[, ...])

Adds regularization of the L1 term around zero values of the budget. This doesn't seem to help much.

optimize_wildcard_budget_barrier(budget, L1weights, ...)

Uses a barrier method (for convex optimization) to optimize the wildcard budget.

NewtonSolve(initial_x, fn[, fn_with_derivs, dx_tol, ...])

optimize_wildcard_budget_cvxopt_smoothed(budget, ...)

Uses a smoothed version of the objective function. Doesn't seem to help much.

class pygsti.optimize.ArraysInterface

Bases: object

An interface between pyGSTi’s optimization methods and data storage arrays.

This class provides an abstract interface to algorithms (particularly the Levenberg-Marquardt nonlinear least-squares algorithm) for creating and manipulating potentially distributed data arrays with types such as “jtj” (Jacobian^T * Jacobian), “jtf” (Jacobian^T * objectivefn_vector), and “x” (model parameter vector). The class encapsulates all the operations on these arrays so that the algorithm doesn’t need to worry about how the arrays are actually stored in memory, e.g. whether shared memory is used or not.

class pygsti.optimize.UndistributedArraysInterface(num_global_elements, num_global_params)

Bases: ArraysInterface

An arrays interface for the case when the arrays are not actually distributed.

Parameters

num_global_elements : int

The total number of objective function “elements”, i.e. the size of the objective function array f.

num_global_params : int

The total number of (model) parameters, i.e. the size of the x array.
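
A minimal usage sketch (not part of the pyGSTi documentation) showing how the allocation and fill methods below fit together in the undistributed case; the problem sizes and random data are made up for illustration:

    import numpy as np
    from pygsti.optimize import UndistributedArraysInterface

    num_elements, num_params = 6, 3                 # sizes of f and x
    ari = UndistributedArraysInterface(num_elements, num_params)

    rng = np.random.default_rng(0)
    j = ari.allocate_jac()                          # 'ep'-type array, shape (6, 3)
    j[:, :] = rng.normal(size=(num_elements, num_params))
    f = rng.normal(size=num_elements)               # objective function vector

    jtj = ari.allocate_jtj()                        # approximate-Hessian storage
    jtf = ari.allocate_jtf()                        # gradient-like storage
    ari.fill_jtj(j, jtj)                            # jtj <- dot(j.T, j)
    ari.fill_jtf(j, f, jtf)                         # jtf <- dot(j.T, f)
    assert np.allclose(jtj, j.T @ j) and np.allclose(jtf, j.T @ f)

    ari.deallocate_jac(j); ari.deallocate_jtj(jtj); ari.deallocate_jtf(jtf)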

allocate_jtf()

Allocate an array for holding a ‘jtf’-type value.

Returns

numpy.ndarray or LocalNumpyArray

allocate_jtj()

Allocate an array for holding an approximated Hessian (type ‘jtj’).

Returns

numpy.ndarray or LocalNumpyArray

allocate_jac()

Allocate an array for holding a Jacobian matrix (type ‘ep’).

Returns

numpy.ndarray or LocalNumpyArray

deallocate_jtf(jtf)

Free an array for holding an objective function value (type ‘jtf’).

Returns

None

deallocate_jtj(jtj)

Free an array for holding an approximated Hessian (type ‘jtj’).

Returns

None

deallocate_jac(jac)

Free an array for holding a Jacobian matrix (type ‘ep’).

Returns

None

global_num_elements()

The total number of objective function “elements”.

This is the size/length of the objective function f vector.

Returns

int

jac_param_slice(only_if_leader=False)

The slice into a Jacobian’s columns that belong to this processor.

Parameters

only_if_leader : bool, optional

If True, the current processor’s parameter slice is only returned if the processor is the “leader” (i.e. the first) of the processors that calculate the same parameter slice. All non-leader processors return the zero-slice slice(0,0).

Returns

slice

jtf_param_slice()

The slice into a ‘jtf’ vector giving the rows owned by this processor.

Returns

slice

param_fine_info()

Returns information regarding how model parameters are distributed among hosts and processors.

This information relates to the “fine” distribution used in distributed layouts, and is needed by some algorithms which utilize shared-memory communication between processors on the same host.

Returns

param_fine_slices_by_host : list

A list with one entry per host. Each entry is itself a list of (rank, (global_param_slice, host_param_slice)) elements where rank is the top-level overall rank of a processor, global_param_slice is the parameter slice that processor owns and host_param_slice is the same slice relative to the parameters owned by the host.

owner_host_and_rank_of_global_fine_param_index : dict

A mapping between parameter indices (keys) and the owning processor rank and host index. Values are (host_index, processor_rank) tuples.

allgather_x(x, global_x)

Gather a parameter (x) vector onto all the processors.

Parameters
xnumpy.array or LocalNumpyArray

The input vector.

global_xnumpy.array or LocalNumpyArray

The output (gathered) vector.

Returns

None

allscatter_x(global_x, x)

Pare down an already-scattered global parameter (x) vector to be just a local x vector.

Parameters
global_xnumpy.array or LocalNumpyArray

The input vector. This global vector is already present on all the processors, so there’s no need to do any MPI communication.

xnumpy.array or LocalNumpyArray

The output vector, typically a slice of global_x.

Returns

None

scatter_x(global_x, x)

Scatter a global parameter (x) vector onto all the processors.

Parameters
global_xnumpy.array or LocalNumpyArray

The input vector.

xnumpy.array or LocalNumpyArray

The output (scattered) vector.

Returns

None

allgather_f(f, global_f)

Gather an objective function (f) vector onto all the processors.

Parameters
fnumpy.array or LocalNumpyArray

The input vector.

global_fnumpy.array or LocalNumpyArray

The output (gathered) vector.

Returns

None

gather_jtj(jtj, return_shared=False)

Gather a Hessian (jtj) matrix onto the root processor.

Parameters

jtj : numpy.array or LocalNumpyArray

The (local) input matrix to gather.

return_shared : bool, optional

Whether the returned array is allowed to be a shared-memory array, which results in a small performance gain because the array used internally to gather the results can be returned directly. When True a shared memory handle is also returned, and the caller assumes responsibility for freeing the memory via pygsti.tools.sharedmemtools.cleanup_shared_ndarray().

Returns

gathered_array : numpy.ndarray or None

The full (global) output array on the root (rank=0) processor and None on all other processors.

shared_memory_handle : multiprocessing.shared_memory.SharedMemory or None

Returned only when return_shared == True. The shared memory handle associated with gathered_array, which is needed to free the memory.

scatter_jtj(global_jtj, jtj)

Scatter a Hessian (jtj) matrix onto all the processors.

Parameters
global_jtjnumpy.ndarray

The global Hessian matrix to scatter.

jtjnumpy.ndarray or LocalNumpyArray

The local destination array.

Returns

None

gather_jtf(jtf, return_shared=False)

Gather a jtf vector onto the root processor.

Parameters

jtf : numpy.array or LocalNumpyArray

The local input vector to gather.

return_shared : bool, optional

Whether the returned array is allowed to be a shared-memory array, which results in a small performance gain because the array used internally to gather the results can be returned directly. When True a shared memory handle is also returned, and the caller assumes responsibility for freeing the memory via pygsti.tools.sharedmemtools.cleanup_shared_ndarray().

Returns

gathered_array : numpy.ndarray or None

The full (global) output array on the root (rank=0) processor and None on all other processors.

shared_memory_handle : multiprocessing.shared_memory.SharedMemory or None

Returned only when return_shared == True. The shared memory handle associated with gathered_array, which is needed to free the memory.

scatter_jtf(global_jtf, jtf)

Scatter a jtf vector onto all the processors.

Parameters
global_jtfnumpy.ndarray

The global vector to scatter.

jtfnumpy.ndarray or LocalNumpyArray

The local destination array.

Returns

None

global_svd_dot(jac_v, minus_jtf)

Gathers the dot product between a jtj-type matrix and a jtf-type vector into a global result array.

This is typically used within SVD-defined basis calculations, where jac_v is the “V” matrix of the SVD of a jacobian, and minus_jtf is the negative dot product between the Jacobian matrix and objective function vector.

Parameters
jac_vnumpy.ndarray or LocalNumpyArray

An array of jtj-type.

minus_jtfnumpy.ndarray or LocalNumpyArray

An array of jtf-type.

Returns
numpy.ndarray

The global (gathered) parameter vector dot(jac_v.T, minus_jtf).

fill_dx_svd(jac_v, global_vec, dx)

Computes the dot product of a jtj-type array with a global parameter array.

The result (dx) is a jtf-type array. This is typically used for computing the x-update vector in the LM method when using a SVD-defined basis.

Parameters
jac_vnumpy.ndarray or LocalNumpyArray

An array of jtj-type.

global_vecnumpy.ndarray

A global parameter vector.

dxnumpy.ndarray or LocalNumpyArray

An array of jtf-type. Filled with dot(jac_v, global_vec) values.

Returns

None

dot_x(x1, x2)

Take the dot product of two x-type vectors.

Parameters
x1, x2numpy.ndarray or LocalNumpyArray

The vectors to operate on.

Returns

float

norm2_x(x)

Compute the Frobenius norm squared of an x-type vector.

Parameters
xnumpy.ndarray or LocalNumpyArray

The vector to operate on.

Returns

float

infnorm_x(x)

Compute the infinity-norm of an x-type vector.

Parameters
xnumpy.ndarray or LocalNumpyArray

The vector to operate on.

Returns

float

max_x(x)

Compute the maximum of an x-type vector.

Parameters
xnumpy.ndarray or LocalNumpyArray

The vector to operate on.

Returns

float

norm2_f(f)

Compute the Frobenius norm squared of an f-type vector.

Parameters
fnumpy.ndarray or LocalNumpyArray

The vector to operate on.

Returns

float

norm2_jtj(jtj)

Compute the Frobenius norm squared of a jtj-type matrix.

Parameters
jtjnumpy.ndarray or LocalNumpyArray

The array to operate on.

Returns

float

norm2_jac(j)

Compute the Frobenius norm squared of a Jacobian matrix (ep-type).

Parameters
jnumpy.ndarray or LocalNumpyArray

The Jacobian to operate on.

Returns

float

fill_jtf(j, f, jtf)

Compute dot(Jacobian.T, f) in supplied memory.

Parameters
jnumpy.ndarray or LocalNumpyArray

Jacobian matrix (type ep).

fnumpy.ndarray or LocalNumpyArray

Objective function vector (type e).

jtfnumpy.ndarray or LocalNumpyArray

Output array, type jtf. Filled with dot(j.T, f) values.

Returns

None

fill_jtj(j, jtj, shared_mem_buf=None)

Compute dot(Jacobian.T, Jacobian) in supplied memory.

Parameters

j : numpy.ndarray or LocalNumpyArray

Jacobian matrix (type ep).

jtj : numpy.ndarray or LocalNumpyArray

Output array, type jtj. Filled with dot(j.T, j) values.

shared_mem_buf : tuple or None

Scratch space of shared memory used to speed up repeated calls to fill_jtj. If not None, the value returned from allocate_jtj_shared_mem_buf().

Returns

None

allocate_jtj_shared_mem_buf()

Allocate scratch space to be used for repeated calls to fill_jtj().

Returns
scratchnumpy.ndarray or None

The scratch array.

shared_memory_handlemultiprocessing.shared_memory.SharedMemory or None

The shared memory handle associated with scratch, which is needed to free the memory.

deallocate_jtj_shared_mem_buf(jtj_buf)

Frees the scratch memory allocated by allocate_jtj_shared_mem_buf().

Parameters
jtj_buftuple or None

The value returned from allocate_jtj_shared_mem_buf()

jtj_diag_indices(jtj)

The indices into a jtj-type array that correspond to diagonal elements of the global matrix.

If jtj were a global quantity, then this would just be numpy.diag_indices_from(jtj), however, it may be more complicated in actuality when different processors hold different sections of the global matrix.

Parameters
jtjnumpy.ndarray or None

The jtj-type array to get the indices with respect to.

Returns
tuple

A tuple of 1D arrays that can be used to index the elements of jtj that correspond to diagonal elements of the global jtj matrix.
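
As an illustration of why jtj_diag_indices exists, the following hypothetical sketch applies identity-mode Levenberg-Marquardt damping (adding mu to the diagonal of a jtj-type array) without assuming how the array is stored; the sizes and the value of mu are made up:

    import numpy as np
    from pygsti.optimize import UndistributedArraysInterface

    ari = UndistributedArraysInterface(num_global_elements=4, num_global_params=3)
    jtj = ari.allocate_jtj()
    jtj[:, :] = np.eye(3)                      # stand-in for an accumulated J^T J
    mu = 1e-3                                  # LM damping parameter
    jtj[ari.jtj_diag_indices(jtj)] += mu       # jtj <- jtj + mu * I
    ari.deallocate_jtj(jtj)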

class pygsti.optimize.DistributedArraysInterface(dist_layout, lsvec_mode, extra_elements=0)

Bases: ArraysInterface

An arrays interface where the arrays are distributed according to a distributed layout.

Parameters

dist_layout : DistributableCOPALayout

The layout giving the distribution of the arrays.

extra_elements : int, optional

The number of additional objective function “elements” beyond those specified by dist_layout. These are often used for penalty terms.

allocate_jtf()

Allocate an array for holding a ‘jtf’-type value.

Returns

numpy.ndarray or LocalNumpyArray

allocate_jtj()

Allocate an array for holding an approximated Hessian (type ‘jtj’).

Returns

numpy.ndarray or LocalNumpyArray

allocate_jac()

Allocate an array for holding a Jacobian matrix (type ‘ep’).

Returns

numpy.ndarray or LocalNumpyArray

deallocate_jtf(jtf)

Free an array for holding an objective function value (type ‘jtf’).

Returns

None

deallocate_jtj(jtj)

Free an array for holding an approximated Hessian (type ‘jtj’).

Returns

None

deallocate_jac(jac)

Free an array for holding a Jacobian matrix (type ‘ep’).

Returns

None

global_num_elements()

The total number of objective function “elements”.

This is the size/length of the objective function f vector.

Returns

int

jac_param_slice(only_if_leader=False)

The slice into a Jacobian’s columns that belong to this processor.

Parameters

only_if_leader : bool, optional

If True, the current processor’s parameter slice is only returned if the processor is the “leader” (i.e. the first) of the processors that calculate the same parameter slice. All non-leader processors return the zero-slice slice(0,0).

Returns

slice

jtf_param_slice()

The slice into a ‘jtf’ vector giving the rows owned by this processor.

Returns

slice

param_fine_info()

Returns information regarding how model parameters are distributed among hosts and processors.

This information relates to the “fine” distribution used in distributed layouts, and is needed by some algorithms which utilize shared-memory communication between processors on the same host.

Returns

param_fine_slices_by_host : list

A list with one entry per host. Each entry is itself a list of (rank, (global_param_slice, host_param_slice)) elements where rank is the top-level overall rank of a processor, global_param_slice is the parameter slice that processor owns and host_param_slice is the same slice relative to the parameters owned by the host.

owner_host_and_rank_of_global_fine_param_index : dict

A mapping between parameter indices (keys) and the owning processor rank and host index. Values are (host_index, processor_rank) tuples.

allgather_x(x, global_x)

Gather a parameter (x) vector onto all the processors.

Parameters
xnumpy.array or LocalNumpyArray

The input vector.

global_xnumpy.array or LocalNumpyArray

The output (gathered) vector.

Returns

None

allscatter_x(global_x, x)

Pare down an already-scattered global parameter (x) vector to be just a local x vector.

Parameters
global_xnumpy.array or LocalNumpyArray

The input vector. This global vector is already present on all the processors, so there’s no need to do any MPI communication.

xnumpy.array or LocalNumpyArray

The output vector, typically a slice of global_x.

Returns

None

scatter_x(global_x, x)

Scatter a global parameter (x) vector onto all the processors.

Parameters
global_xnumpy.array or LocalNumpyArray

The input vector.

xnumpy.array or LocalNumpyArray

The output (scattered) vector.

Returns

None

allgather_f(f, global_f)

Gather an objective function (f) vector onto all the processors.

Parameters
fnumpy.array or LocalNumpyArray

The input vector.

global_fnumpy.array or LocalNumpyArray

The output (gathered) vector.

Returns

None

gather_jtj(jtj, return_shared=False)

Gather a Hessian (jtj) matrix onto the root processor.

Parameters

jtj : numpy.array or LocalNumpyArray

The (local) input matrix to gather.

return_shared : bool, optional

Whether the returned array is allowed to be a shared-memory array, which results in a small performance gain because the array used internally to gather the results can be returned directly. When True a shared memory handle is also returned, and the caller assumes responsibility for freeing the memory via pygsti.tools.sharedmemtools.cleanup_shared_ndarray().

Returns

gathered_array : numpy.ndarray or None

The full (global) output array on the root (rank=0) processor and None on all other processors.

shared_memory_handle : multiprocessing.shared_memory.SharedMemory or None

Returned only when return_shared == True. The shared memory handle associated with gathered_array, which is needed to free the memory.

scatter_jtj(global_jtj, jtj)

Scatter a Hessian (jtj) matrix onto all the processors.

Parameters
global_jtjnumpy.ndarray

The global Hessian matrix to scatter.

jtjnumpy.ndarray or LocalNumpyArray

The local destination array.

Returns

None

gather_jtf(jtf, return_shared=False)

Gather a jtf vector onto the root processor.

Parameters

jtf : numpy.array or LocalNumpyArray

The local input vector to gather.

return_shared : bool, optional

Whether the returned array is allowed to be a shared-memory array, which results in a small performance gain because the array used internally to gather the results can be returned directly. When True a shared memory handle is also returned, and the caller assumes responsibility for freeing the memory via pygsti.tools.sharedmemtools.cleanup_shared_ndarray().

Returns

gathered_array : numpy.ndarray or None

The full (global) output array on the root (rank=0) processor and None on all other processors.

shared_memory_handle : multiprocessing.shared_memory.SharedMemory or None

Returned only when return_shared == True. The shared memory handle associated with gathered_array, which is needed to free the memory.

scatter_jtf(global_jtf, jtf)

Scatter a jtf vector onto all the processors.

Parameters
global_jtfnumpy.ndarray

The global vector to scatter.

jtfnumpy.ndarray or LocalNumpyArray

The local destination array.

Returns

None

global_svd_dot(jac_v, minus_jtf)

Gathers the dot product between a jtj-type matrix and a jtf-type vector into a global result array.

This is typically used within SVD-defined basis calculations, where jac_v is the “V” matrix of the SVD of a jacobian, and minus_jtf is the negative dot product between the Jacobian matrix and objective function vector.

Parameters
jac_vnumpy.ndarray or LocalNumpyArray

An array of jtj-type.

minus_jtfnumpy.ndarray or LocalNumpyArray

An array of jtf-type.

Returns
numpy.ndarray

The global (gathered) parameter vector dot(jac_v.T, minus_jtf).

fill_dx_svd(jac_v, global_vec, dx)

Computes the dot product of a jtj-type array with a global parameter array.

The result (dx) is a jtf-type array. This is typically used for computing the x-update vector in the LM method when using a SVD-defined basis.

Parameters
jac_vnumpy.ndarray or LocalNumpyArray

An array of jtj-type.

global_vecnumpy.ndarray

A global parameter vector.

dxnumpy.ndarray or LocalNumpyArray

An array of jtf-type. Filled with dot(jac_v, global_vec) values.

Returns

None

dot_x(x1, x2)

Take the dot product of two x-type vectors.

Parameters
x1, x2numpy.ndarray or LocalNumpyArray

The vectors to operate on.

Returns

float

norm2_x(x)

Compute the Frobenius norm squared of an x-type vector.

Parameters
xnumpy.ndarray or LocalNumpyArray

The vector to operate on.

Returns

float

infnorm_x(x)

Compute the infinity-norm of an x-type vector.

Parameters
xnumpy.ndarray or LocalNumpyArray

The vector to operate on.

Returns

float

min_x(x)

Compute the minimum of an x-type vector.

Parameters
xnumpy.ndarray or LocalNumpyArray

The vector to operate on.

Returns

float

max_x(x)

Compute the maximum of an x-type vector.

Parameters
xnumpy.ndarray or LocalNumpyArray

The vector to operate on.

Returns

float

norm2_f(f)

Compute the Frobenius norm squared of an f-type vector.

Parameters
fnumpy.ndarray or LocalNumpyArray

The vector to operate on.

Returns

float

norm2_jac(j)

Compute the Frobenius norm squared of a Jacobian matrix (ep-type).

Parameters
jnumpy.ndarray or LocalNumpyArray

The Jacobian to operate on.

Returns

float

norm2_jtj(jtj)

Compute the Frobenius norm squared of a jtj-type matrix.

Parameters
jtjnumpy.ndarray or LocalNumpyArray

The array to operate on.

Returns

float

fill_jtf(j, f, jtf)

Compute dot(Jacobian.T, f) in supplied memory.

Parameters
jnumpy.ndarray or LocalNumpyArray

Jacobian matrix (type ep).

fnumpy.ndarray or LocalNumpyArray

Objective function vector (type e).

jtfnumpy.ndarray or LocalNumpyArray

Output array, type jtf. Filled with dot(j.T, f) values.

Returns

None

fill_jtj(j, jtj, shared_mem_buf=None)

Compute dot(Jacobian.T, Jacobian) in supplied memory.

Parameters

j : numpy.ndarray or LocalNumpyArray

Jacobian matrix (type ep).

jtj : numpy.ndarray or LocalNumpyArray

Output array, type jtj. Filled with dot(j.T, j) values.

shared_mem_buf : tuple or None

Scratch space of shared memory used to speed up repeated calls to fill_jtj. If not None, the value returned from allocate_jtj_shared_mem_buf().

Returns

None

allocate_jtj_shared_mem_buf()

Allocate scratch space to be used for repeated calls to fill_jtj().

Returns
scratchnumpy.ndarray or None

The scratch array.

shared_memory_handlemultiprocessing.shared_memory.SharedMemory or None

The shared memory handle associated with scratch, which is needed to free the memory.

deallocate_jtj_shared_mem_buf(jtj_buf)

Frees the scratch memory allocated by allocate_jtj_shared_mem_buf().

Parameters
jtj_buftuple or None

The value returned from allocate_jtj_shared_mem_buf()

jtj_diag_indices(jtj)

The indices into a jtj-type array that correspond to diagonal elements of the global matrix.

If jtj were a global quantity, then this would just be numpy.diag_indices_from(jtj), however, it may be more complicated in actuality when different processors hold different sections of the global matrix.

Parameters
jtjnumpy.ndarray or None

The jtj-type array to get the indices with respect to.

Returns
tuple

A tuple of 1D arrays that can be used to index the elements of jtj that correspond to diagonal elements of the global jtj matrix.

class pygsti.optimize.OptimizerResult(objective_func, opt_x, opt_f=None, opt_jtj=None, opt_unpenalized_f=None, chi2_k_distributed_qty=None, optimizer_specific_qtys=None)

Bases: object

The result from an optimization.

Parameters

objective_func : ObjectiveFunction

The objective function that was optimized.

opt_x : numpy.ndarray

The optimal argument (x) value. Often a vector of parameters.

opt_f : numpy.ndarray

The optimal objective function (f) value. Often this is the least-squares vector of objective function values.

opt_jtj : numpy.ndarray, optional

The optimal dot(transpose(J), J) value, where J is the Jacobian matrix. This may be useful for computing approximate error bars.

opt_unpenalized_f : numpy.ndarray, optional

The optimal objective function (f) value with any penalty terms removed.

chi2_k_distributed_qty : float, optional

A value that is expected to be chi2_k distributed.

optimizer_specific_qtys : dict, optional

A dictionary of additional optimization parameters.
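
A small illustrative sketch of packaging optimizer output into an OptimizerResult using only the constructor arguments listed above; the None objective-function argument is a placeholder for brevity and would normally be the ObjectiveFunction that was optimized:

    import numpy as np
    from pygsti.optimize import OptimizerResult

    opt_x = np.array([0.1, -0.2, 0.05])            # best-fit parameter vector
    opt_f = np.array([1e-3, 2e-3])                 # least-squares residual vector
    result = OptimizerResult(None, opt_x, opt_f=opt_f,
                             chi2_k_distributed_qty=float(np.sum(opt_f**2)))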

class pygsti.optimize.Optimizer

Bases: pygsti.baseobjs.nicelyserializable.NicelySerializable

An optimizer. Optimizes an objective function.

classmethod cast(obj)

Cast obj to an Optimizer.

If obj is already an Optimizer it is just returned, otherwise this function tries to create a new object using obj as a dictionary of constructor arguments.

Parameters

obj : Optimizer or dict

The object to cast.

Returns

Optimizer
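
A short sketch of the two cast paths described above. cast is inherited by Optimizer subclasses, and the dict path assumes the keys are constructor arguments of the subclass on which it is invoked (here CustomLMOptimizer):

    from pygsti.optimize import CustomLMOptimizer

    existing = CustomLMOptimizer(maxiter=50)
    opt_a = CustomLMOptimizer.cast(existing)                      # already an Optimizer: returned as-is
    opt_b = CustomLMOptimizer.cast({'maxiter': 50, 'tol': 1e-7})  # dict of constructor arguments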

class pygsti.optimize.CustomLMOptimizer(maxiter=100, maxfev=100, tol=1e-06, fditer=0, first_fditer=0, damping_mode='identity', damping_basis='diagonal_values', damping_clip=None, use_acceleration=False, uphill_step_threshold=0.0, init_munu='auto', oob_check_interval=0, oob_action='reject', oob_check_mode=0, serial_solve_proc_threshold=100, lsvec_mode='normal')

Bases: Optimizer

A Levenberg-Marquardt optimizer customized for GST-like problems.

Parameters

maxiter : int, optional

The maximum number of (outer) iterations.

maxfev : int, optional

The maximum number of function evaluations.

tol : float or dict, optional

The tolerance, specified as a single float or as a dict with keys {‘relx’, ‘relf’, ‘jac’, ‘maxdx’}. A single float sets the ‘relf’ and ‘jac’ elements and leaves the others at their default values.

fditer : int, optional

Internally compute the Jacobian using a finite-difference method for the first fditer iterations. This is useful when the initial point lies at a special or singular point where the analytic Jacobian is misleading.

first_fditer : int, optional

Number of finite-difference iterations applied to the first stage of the optimization (only). Unused.

damping_mode : {‘identity’, ‘JTJ’, ‘invJTJ’, ‘adaptive’}

How damping is applied. ‘identity’ means that the damping parameter mu multiplies the identity matrix. ‘JTJ’ means that mu multiplies the diagonal or singular values (depending on damping_basis) of the JTJ (Fisher information / approximate Hessian) matrix, whereas ‘invJTJ’ means mu multiplies the reciprocals of these values instead. The ‘adaptive’ mode adaptively chooses a damping strategy.

damping_basis : {‘diagonal_values’, ‘singular_values’}

Whether the diagonal or singular values of the JTJ matrix are used during damping. If ‘singular_values’ is selected, then an SVD of the Jacobian (J) matrix is performed and damping is applied in the basis of (right) singular vectors. If ‘diagonal_values’ is selected, the diagonal values of relevant matrices are used as a proxy for the singular values (saving the cost of performing an SVD).

damping_clip : tuple, optional

A 2-tuple giving upper and lower bounds for the values that mu multiplies. If damping_mode == “identity” then this argument is ignored, as mu always multiplies a 1.0 on the diagonal of the identity matrix. If None, then no clipping is applied.

use_acceleration : bool, optional

Whether to include a geodesic acceleration term as suggested in arXiv:1201.5885. This is supposed to increase the rate of convergence with very little overhead. In practice we’ve seen mixed results.

uphill_step_threshold : float, optional

Allows uphill steps when taking two consecutive steps in nearly the same direction. The condition for accepting an uphill step is that (uphill_step_threshold-beta)*new_objective < old_objective, where beta is the cosine of the angle between successive steps. If uphill_step_threshold == 0 then no uphill steps are allowed, otherwise it should take a value between 1.0 and 2.0, with 1.0 being the most permissive to uphill steps.

init_munu : tuple, optional

If not None, a (mu, nu) tuple of 2 floats giving the initial values for mu and nu.

oob_check_interval : int, optional

Every oob_check_interval outer iterations, the objective function (obj_fn) is called with a second argument ‘oob_check’, set to True. In this case, obj_fn can raise a ValueError exception to indicate that it is Out Of Bounds. If oob_check_interval is 0 then this check is never performed; if 1 then it is always performed.

oob_action : {“reject”, “stop”}

What to do when the objective function indicates that it is out of bounds (by raising a ValueError as described above). “reject” means the step is rejected but the optimization proceeds; “stop” means the optimization stops and returns as converged at the last known-in-bounds point.

oob_check_mode : int, optional

An advanced option, expert use only. If 0 then the optimization is halted as soon as an attempt is made to evaluate the function out of bounds. If 1 then the optimization is halted only when a would-be accepted step is out of bounds.

serial_solve_proc_threshold : int, optional

When there are fewer than this many processors, the optimizer will solve linear systems serially, using SciPy on a single processor, rather than using a parallelized Gaussian Elimination (with partial pivoting) algorithm coded in Python. Since SciPy’s implementation is more efficient, it’s not worth using the parallel version until there are many processors to spread the work among.

lsvec_mode : {‘normal’, ‘percircuit’}

Whether the terms used in the least-squares optimization are the “elements” as computed by the objective function’s .terms() and .lsvec() methods (‘normal’ mode) or the “per-circuit quantities” computed by the objective function’s .percircuit() and .lsvec_percircuit() methods (‘percircuit’ mode).
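
A configuration sketch using a few of the parameters above; the particular values are illustrative, and in typical use the constructed optimizer is handed to pyGSTi's GST driver routines rather than having run() called directly:

    from pygsti.optimize import CustomLMOptimizer

    lm_opt = CustomLMOptimizer(
        maxiter=200,                       # allow more outer LM iterations than the default
        tol=1e-8,                          # single float sets the 'relf' and 'jac' tolerances
        damping_mode='adaptive',           # let the optimizer choose its damping strategy
        damping_basis='singular_values',   # damp in the SVD basis of the Jacobian
        uphill_step_threshold=1.0,         # permit mildly uphill steps
    )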

run(objective, profiler, printer)

Perform the optimization.

Parameters

objective : ObjectiveFunction

The objective function to optimize.

profiler : Profiler

A profiler to track resource usage.

printer : VerbosityPrinter

The printer to use for sending output to stdout.

pygsti.optimize.custom_leastsq(obj_fn, jac_fn, x0, f_norm2_tol=1e-06, jac_norm_tol=1e-06, rel_ftol=1e-06, rel_xtol=1e-06, max_iter=100, num_fd_iters=0, max_dx_scale=1.0, damping_mode='identity', damping_basis='diagonal_values', damping_clip=None, use_acceleration=False, uphill_step_threshold=0.0, init_munu='auto', oob_check_interval=0, oob_action='reject', oob_check_mode=0, resource_alloc=None, arrays_interface=None, serial_solve_proc_threshold=100, x_limits=None, verbosity=0, profiler=None)

An implementation of the Levenberg-Marquardt least-squares optimization algorithm customized for use within pyGSTi.

This general-purpose routine mimics, to a large extent, the interface used by scipy.optimize.leastsq, though it implements a newer (and more robust) version of the algorithm.

Parameters

obj_fn : function

The objective function. Must accept and return 1D numpy ndarrays of length N and M respectively. Same form as scipy.optimize.leastsq.

jac_fn : function

The jacobian function (not optional!). Accepts a 1D array of length N and returns an array of shape (M,N).

x0 : numpy.ndarray

Initial evaluation point.

f_norm2_tol : float, optional

Tolerance for F**2, where F**2 = sum(obj_fn(x)**2) is the squared norm of the least-squares residual. If F**2 < f_norm2_tol, then mark converged.

jac_norm_tol : float, optional

Tolerance for the jacobian norm, namely if infn(dot(J.T,f)) < jac_norm_tol then mark converged, where infn is the infinity-norm and f = obj_fn(x).

rel_ftol : float, optional

Tolerance on the relative reduction in F^2, that is, if d(F^2)/F^2 < rel_ftol then mark converged.

rel_xtol : float, optional

Tolerance on the relative value of |x|, so that if d(|x|)/|x| < rel_xtol then mark converged.

max_iter : int, optional

The maximum number of (outer) iterations.

num_fd_iters : int, optional

Internally compute the Jacobian using a finite-difference method for the first num_fd_iters iterations. This is useful when x0 lies at a special or singular point where the analytic Jacobian is misleading.

max_dx_scale : float, optional

If not None, impose a limit on the magnitude of the step, so that |dx|^2 < max_dx_scale^2 * len(dx) (so elements of dx should be, roughly, less than max_dx_scale).

damping_mode : {‘identity’, ‘JTJ’, ‘invJTJ’, ‘adaptive’}

How damping is applied. ‘identity’ means that the damping parameter mu multiplies the identity matrix. ‘JTJ’ means that mu multiplies the diagonal or singular values (depending on damping_basis) of the JTJ (Fisher information / approximate Hessian) matrix, whereas ‘invJTJ’ means mu multiplies the reciprocals of these values instead. The ‘adaptive’ mode adaptively chooses a damping strategy.

damping_basis : {‘diagonal_values’, ‘singular_values’}

Whether the diagonal or singular values of the JTJ matrix are used during damping. If ‘singular_values’ is selected, then an SVD of the Jacobian (J) matrix is performed and damping is applied in the basis of (right) singular vectors. If ‘diagonal_values’ is selected, the diagonal values of relevant matrices are used as a proxy for the singular values (saving the cost of performing an SVD).

damping_clip : tuple, optional

A 2-tuple giving upper and lower bounds for the values that mu multiplies. If damping_mode == “identity” then this argument is ignored, as mu always multiplies a 1.0 on the diagonal of the identity matrix. If None, then no clipping is applied.

use_acceleration : bool, optional

Whether to include a geodesic acceleration term as suggested in arXiv:1201.5885. This is supposed to increase the rate of convergence with very little overhead. In practice we’ve seen mixed results.

uphill_step_threshold : float, optional

Allows uphill steps when taking two consecutive steps in nearly the same direction. The condition for accepting an uphill step is that (uphill_step_threshold-beta)*new_objective < old_objective, where beta is the cosine of the angle between successive steps. If uphill_step_threshold == 0 then no uphill steps are allowed, otherwise it should take a value between 1.0 and 2.0, with 1.0 being the most permissive to uphill steps.

init_munu : tuple, optional

If not None, a (mu, nu) tuple of 2 floats giving the initial values for mu and nu.

oob_check_interval : int, optional

Every oob_check_interval outer iterations, the objective function (obj_fn) is called with a second argument ‘oob_check’, set to True. In this case, obj_fn can raise a ValueError exception to indicate that it is Out Of Bounds. If oob_check_interval is 0 then this check is never performed; if 1 then it is always performed.

oob_action : {“reject”, “stop”}

What to do when the objective function indicates that it is out of bounds (by raising a ValueError as described above). “reject” means the step is rejected but the optimization proceeds; “stop” means the optimization stops and returns as converged at the last known-in-bounds point.

oob_check_mode : int, optional

An advanced option, expert use only. If 0 then the optimization is halted as soon as an attempt is made to evaluate the function out of bounds. If 1 then the optimization is halted only when a would-be accepted step is out of bounds.

resource_alloc : ResourceAllocation, optional

When not None, a resource allocation object used for distributing the computation across multiple processors.

arrays_interface : ArraysInterface

An object that provides an interface for creating and manipulating data arrays.

serial_solve_proc_threshold : int, optional

When there are fewer than this many processors, the optimizer will solve linear systems serially, using SciPy on a single processor, rather than using a parallelized Gaussian Elimination (with partial pivoting) algorithm coded in Python. Since SciPy’s implementation is more efficient, it’s not worth using the parallel version until there are many processors to spread the work among.

x_limits : numpy.ndarray, optional

A (num_params, 2)-shaped array, holding on each row the (min, max) values for the corresponding parameter (element of the “x” vector). If None, then no limits are imposed.

verbosity : int, optional

Amount of detail to print to stdout.

profiler : Profiler, optional

A profiler object used to track timing and memory usage.

Returns

x : numpy.ndarray

The optimal solution.

converged : bool

Whether the solution converged.

msg : str

A message indicating why the solution converged (or didn’t).
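
A hypothetical end-to-end call on a tiny two-parameter least-squares problem; the residual function, Jacobian, and sizes are made up, and the unpacking of the first three return values follows the Returns section above:

    import numpy as np
    from pygsti.optimize import UndistributedArraysInterface, custom_leastsq

    def obj_fn(x):                       # residual vector, zero at x = (3, -1)
        return np.array([x[0] - 3.0, 2.0 * (x[1] + 1.0)])

    def jac_fn(x):                       # analytic Jacobian of obj_fn
        return np.array([[1.0, 0.0],
                         [0.0, 2.0]])

    ari = UndistributedArraysInterface(num_global_elements=2, num_global_params=2)
    result = custom_leastsq(obj_fn, jac_fn, np.zeros(2),
                            max_iter=50, arrays_interface=ari)
    x, converged, msg = result[0], result[1], result[2]   # per the Returns section above
    print(x, converged, msg)             # expect x close to (3, -1)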

pygsti.optimize.custom_solve(a, b, x, ari, resource_alloc, proc_threshold=100)

Simple parallel Gaussian Elimination with pivoting.

This function was built to provide a parallel alternative to scipy.linalg.solve, and can achieve faster runtimes compared with the serial SciPy routine when the number of available processors and problem size are large enough.

When the number of processors is greater than proc_threshold (below this number the routine just calls scipy.linalg.solve on the root processor) the method works as follows:

  • each processor “owns” some subset of the rows of a and b.

  • iteratively (over pivot columns), the best pivot row is found, and this row is used to eliminate all other elements in the current pivot column. This procedure operates on the joined matrix a|b, and when it completes the matrix a is in reduced row echelon form (RREF).

  • back substitution (trivial because a is in RREF) is performed to find the solution x such that a @ x = b.

Parameters

a : LocalNumpyArray

A 2D array with the ‘jtj’ distribution, holding the rows of the a matrix belonging to the current processor. (This belonging is dictated by the “fine” distribution in a distributed layout.)

b : LocalNumpyArray

A 1D array with the ‘jtf’ distribution, holding the rows of the b vector belonging to the current processor.

x : LocalNumpyArray

A 1D array with the ‘jtf’ distribution, holding the rows of the x vector belonging to the current processor. This vector is filled by this function.

ari : ArraysInterface

An object that provides an interface for creating and manipulating data arrays.

resource_alloc : ResourceAllocation

Gives the resources (e.g., processors and memory) available for use.

proc_threshold : int, optional

Below this number of processors this routine will simply gather a and b to a single (the rank 0) processor, call SciPy’s serial linear solver, scipy.linalg.solve, and scatter the results back onto all the processors.

Returns

None
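
A hypothetical single-process call, which (being below proc_threshold) exercises the scipy.linalg.solve fallback described above; the matrix values are made up, and the ResourceAllocation import path from pygsti.baseobjs is assumed:

    import numpy as np
    from pygsti.baseobjs import ResourceAllocation
    from pygsti.optimize import UndistributedArraysInterface, custom_solve

    ari = UndistributedArraysInterface(num_global_elements=3, num_global_params=3)
    a = ari.allocate_jtj()                       # 'jtj'-type (3 x 3) coefficient matrix
    a[:, :] = np.array([[4., 1., 0.],
                        [1., 3., 1.],
                        [0., 1., 2.]])
    b = ari.allocate_jtf()                       # 'jtf'-type right-hand side
    b[:] = np.array([1., 2., 3.])
    x = ari.allocate_jtf()                       # solution vector, filled in place

    custom_solve(a, b, x, ari, ResourceAllocation(), proc_threshold=100)
    assert np.allclose(a @ x, b)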

pygsti.optimize.fmax_cg(f, x0, maxiters=100, tol=1e-08, dfdx_and_bdflag=None, xopt=None)

Custom conjugate-gradient (CG) routine for maximizing a function.

This function runs slower than scipy.optimize’s ‘CG’ method, but doesn’t give up or get stuck as easily, and so sometimes can be a better option.

Parameters

f : function

The function to optimize.

x0 : numpy array

The starting point (argument to f).

maxiters : int, optional

Maximum iterations.

tol : float, optional

Tolerance for convergence (compared to the absolute difference in f).

dfdx_and_bdflag : function, optional

Function to compute the jacobian of f as well as a boundary flag.

xopt : numpy array, optional

Used for debugging; if given, output relating the current optimum to xopt (assumed to be a known good optimum) can be printed.

Returns

scipy.optimize.Result object

Includes members ‘x’, ‘fun’, ‘success’, and ‘message’. Note: returns the negated maximum in ‘fun’ in order to conform to the return value of other minimization routines.
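
A toy maximization sketch; the function, starting point, and tolerances are made up, no dfdx_and_bdflag is supplied (derivatives are assumed to be handled internally), and the negated maximum appears in ‘fun’ as noted above:

    import numpy as np
    from pygsti.optimize import fmax_cg

    def f(x):                                   # concave; maximum of 4.0 at the origin
        return 4.0 - float(np.dot(x, x))

    result = fmax_cg(f, np.array([1.0, -2.0]), maxiters=200, tol=1e-10)
    print(result.x, result.fun)                 # expect x near (0, 0) and fun near -4.0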

pygsti.optimize.minimize(fn, x0, method='cg', callback=None, tol=1e-10, maxiter=1000000, maxfev=None, stopval=None, jac=None, verbosity=0, **addl_kwargs)

Minimizes the function fn starting at x0.

This is a gateway function to all other minimization routines within this module, providing a common interface to many different minimization methods (including and extending beyond those available from scipy.optimize).

Parameters

fn : function

The function to minimize.

x0 : numpy array

The starting point (argument to fn).

method : string, optional

Which minimization method to use. Allowed values are:
“simplex” : uses _fmin_simplex
“supersimplex” : uses _fmin_supersimplex
“customcg” : uses fmax_cg (custom CG method)
“brute” : uses scipy.optimize.brute
“basinhopping” : uses scipy.optimize.basinhopping with L-BFGS-B
“swarm” : uses _fmin_particle_swarm
“evolve” : uses _fmin_evolutionary (which uses DEAP)
< methods available from scipy.optimize.minimize >

callback : function, optional

A callback function to be called in order to track optimizer progress. Should have signature: myCallback(x, f=None, accepted=None). Note that the create_objfn_printer(…) function can be used to create such a callback.

tol : float, optional

Tolerance value used for all types of tolerances available in a given method.

maxiter : int, optional

Maximum iterations.

maxfev : int, optional

Maximum function evaluations; used only when available, and defaults to maxiter.

stopval : float, optional

For the basinhopping method only. When f <= stopval the basinhopping outer loop will terminate. Useful when a bound on the minimum is known.

jac : function

Jacobian function.

verbosity : int

Level of detail to print to stdout.

addl_kwargs : dict

Additional arguments for the specific optimizer being used.

Returns

scipy.optimize.Result object

Includes members ‘x’, ‘fun’, ‘success’, and ‘message’.
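
A sketch of the gateway behavior using both the default custom ‘cg’ method and a pass-through SciPy method name; the Rosenbrock test function and starting point are illustrative:

    import numpy as np
    from pygsti.optimize import minimize

    def rosen(x):                                   # classic Rosenbrock test function
        return float((1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2)

    res_cg = minimize(rosen, np.array([-1.2, 1.0]), method='cg', tol=1e-10)
    res_nm = minimize(rosen, np.array([-1.2, 1.0]), method='Nelder-Mead', maxiter=5000)
    print(res_cg.x, res_nm.x)                       # both should approach (1, 1)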

pygsti.optimize.create_objfn_printer(obj_func, start_time=None)

Create a callback function that prints the value of an objective function.

Parameters

obj_func : function

The objective function to print.

start_time : float, optional

A reference starting time to use when printing elapsed times. If None, then the system time when this function is called is used (which is often what you want).

Returns

function

A callback function which prints obj_func.
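
A sketch pairing create_objfn_printer with minimize, as suggested in the minimize documentation above; the quadratic objective is illustrative:

    import numpy as np
    from pygsti.optimize import create_objfn_printer, minimize

    def objective(x):
        return float(np.sum((x - 1.0)**2))          # minimized at x = (1, 1, 1)

    progress_cb = create_objfn_printer(objective)   # prints objective values as it is called
    result = minimize(objective, np.zeros(3), method='cg',
                      callback=progress_cb, tol=1e-10)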

pygsti.optimize.check_jac(f, x0, jac_to_check, eps=1e-10, tol=1e-06, err_type='rel', verbosity=1)

Checks a jacobian function using finite differences.

Parameters

f : function

The function to check.

x0 : numpy array

The point at which to check the jacobian.

jac_to_check : function

A function which should compute the jacobian of f at x0.

eps : float, optional

Epsilon to use in finite difference calculations of the jacobian.

tol : float, optional

The allowed tolerance on the relative difference between the values of the finite difference and jac_to_check jacobians if err_type == ‘rel’, or on the absolute difference if err_type == ‘abs’.

err_type : {‘rel’, ‘abs’}, optional

How to interpret tol (see above).

verbosity : int, optional

Controls how much detail is printed to stdout.

Returns

errSum : float

The total error between the jacobians.

errs : list

List of (row, col, err) tuples giving the error for each row and column.

ffd_jac : numpy array

The computed forward-finite-difference jacobian.

pygsti.optimize.optimize_wildcard_budget_neldermead(budget, L1weights, wildcard_objfn, two_dlogl_threshold, redbox_threshold, printer, smart_init=True, max_outer_iters=10, initial_eta=10.0)

Uses repeated Nelder-Mead to optimize the wildcard budget. Includes both aggregate and per-circuit constraints.

pygsti.optimize.optimize_wildcard_budget_percircuit_only_cvxpy(budget, L1weights, objfn, redbox_threshold, printer)

Uses CVXPY to optimize the wildcard budget. Includes only per-circuit constraints.

pygsti.optimize.optimize_wildcard_bisect_alpha(budget, objfn, two_dlogl_threshold, redbox_threshold, printer, guess=0.1, tol=0.001)
pygsti.optimize.optimize_wildcard_budget_cvxopt(budget, L1weights, objfn, two_dlogl_threshold, redbox_threshold, printer, abs_tol=1e-05, rel_tol=1e-05, max_iters=50)

Uses CVXOPT to optimize the wildcard budget. Includes both aggregate and per-circuit constraints.

pygsti.optimize.optimize_wildcard_budget_cvxopt_zeroreg(budget, L1weights, objfn, two_dlogl_threshold, redbox_threshold, printer, abs_tol=1e-05, rel_tol=1e-05, max_iters=50, small=1e-06)

Adds regularization of the L1 term around zero values of the budget. This doesn’t seem to help much.

pygsti.optimize.optimize_wildcard_budget_barrier(budget, L1weights, objfn, two_dlogl_threshold, redbox_threshold, printer, tol=1e-07, max_iters=50, num_steps=3, save_debugplot_data=False)

Uses a barrier method (for convex optimization) to optimize the wildcard budget. Includes both aggregate and per-circuit constraints.

pygsti.optimize.NewtonSolve(initial_x, fn, fn_with_derivs=None, dx_tol=1e-06, max_iters=20, printer=None, lmbda=0.0)
pygsti.optimize.optimize_wildcard_budget_cvxopt_smoothed(budget, L1weights, objfn, two_dlogl_threshold, redbox_threshold, printer, abs_tol=1e-05, rel_tol=1e-05, max_iters=50)

Uses a smoothed version of the objective function. Doesn't seem to help much.

The thinking here was to eliminate the 2nd derivative discontinuities of the original problem.