pygsti.optimize.customlm

Custom implementation of the Levenberg-Marquardt Algorithm

Module Contents

Classes

OptimizerResult

The result from an optimization.

Optimizer

An optimizer. Optimizes an objective function.

CustomLMOptimizer

A Levenberg-Marquardt optimizer customized for GST-like problems.

Functions

custom_leastsq(obj_fn, jac_fn, x0[, f_norm2_tol, ...])

An implementation of the Levenberg-Marquardt least-squares optimization algorithm customized for use within pyGSTi.

class pygsti.optimize.customlm.OptimizerResult(objective_func, opt_x, opt_f=None, opt_jtj=None, opt_unpenalized_f=None, chi2_k_distributed_qty=None, optimizer_specific_qtys=None)

Bases: object

The result from an optimization.

Parameters

objective_funcObjectiveFunction

The objective function that was optimized.

opt_xnumpy.ndarray

The optimal argument (x) value. Often a vector of parameters.

opt_fnumpy.ndarray

The optimal objective function (f) value. Often this is the least-squares vector of objective function values.

opt_jtjnumpy.ndarray, optional

The optimal dot(transpose(J),J) value, where J is the Jacobian matrix. This may be useful for computing approximate error bars.

opt_unpenalized_fnumpy.ndarray, optional

The optimal objective function (f) value with any penalty terms removed.

chi2_k_distributed_qtyfloat, optional

A value that is supposed to be chi2_k distributed.

optimizer_specific_qtysdict, optional

A dictionary of additional optimization parameters.

class pygsti.optimize.customlm.Optimizer

Bases: pygsti.baseobjs.nicelyserializable.NicelySerializable

An optimizer. Optimizes an objective function.

classmethod cast(obj)

Cast obj to an Optimizer.

If obj is already an Optimizer, it is returned unchanged. Otherwise, this function tries to create a new object using obj as a dictionary of constructor arguments.

Parameters

objOptimizer or dict

The object to cast.

Returns

Optimizer
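The casting behavior described above can be sketched with a small stand-in class. This is an illustration of the pattern only: SketchOptimizer and its constructor arguments are hypothetical and are not pyGSTi code.

```python
class SketchOptimizer:
    """Hypothetical stand-in for an Optimizer subclass (not pyGSTi code)."""

    def __init__(self, maxiter=100, tol=1e-6):
        self.maxiter = maxiter
        self.tol = tol

    @classmethod
    def cast(cls, obj):
        # Already the right type: hand the same object back unchanged.
        if isinstance(obj, cls):
            return obj
        # Otherwise treat obj as a dict of constructor arguments.
        return cls(**obj)


opt = SketchOptimizer.cast({"maxiter": 50})  # built from a dict
same = SketchOptimizer.cast(opt)             # returned as-is
```

With this pattern, calling code can accept either a ready-made optimizer or a plain dict of settings and normalize both with a single call.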

class pygsti.optimize.customlm.CustomLMOptimizer(maxiter=100, maxfev=100, tol=1e-06, fditer=0, first_fditer=0, damping_mode='identity', damping_basis='diagonal_values', damping_clip=None, use_acceleration=False, uphill_step_threshold=0.0, init_munu='auto', oob_check_interval=0, oob_action='reject', oob_check_mode=0, serial_solve_proc_threshold=100, lsvec_mode='normal')

Bases: Optimizer

A Levenberg-Marquardt optimizer customized for GST-like problems.

Parameters

maxiterint, optional

The maximum number of (outer) iterations.

maxfevint, optional

The maximum number of function evaluations.

tolfloat or dict, optional

The tolerance, specified as a single float or as a dict with keys {‘relx’, ‘relf’, ‘jac’, ‘maxdx’}. A single float sets the ‘relf’ and ‘jac’ elements and leaves the others at their default values.
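As a rough sketch of the float-versus-dict rule for tol, the helper below normalizes either form to a dict with all four keys. The helper name expand_tol and the default values in ASSUMED_DEFAULTS are assumptions made for illustration, as is the handling of partial dicts; the documentation above does not state the actual defaults.

```python
# Placeholder defaults: assumptions for illustration, not pyGSTi's values.
ASSUMED_DEFAULTS = {"relx": 1e-8, "relf": 1e-6, "jac": 1e-6, "maxdx": 1.0}


def expand_tol(tol):
    """Hypothetical helper: normalize tol to a dict with all four keys."""
    full = dict(ASSUMED_DEFAULTS)
    if isinstance(tol, dict):
        # A dict overrides only the keys it names.
        full.update(tol)
    else:
        # A single float sets 'relf' and 'jac' and leaves the rest alone.
        full["relf"] = float(tol)
        full["jac"] = float(tol)
    return full


from_float = expand_tol(1e-4)
from_dict = expand_tol({"relx": 1e-10})
```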

fditerint, optional

Internally compute the Jacobian using a finite-difference method for the first fditer iterations. This is useful when the initial point lies at a special or singular point where the analytic Jacobian is misleading.

first_fditerint, optional

Number of finite-difference iterations applied to the first stage of the optimization (only). Unused.

damping_mode{‘identity’, ‘JTJ’, ‘invJTJ’, ‘adaptive’}

How damping is applied. ‘identity’ means that the damping parameter mu multiplies the identity matrix. ‘JTJ’ means that mu multiplies the diagonal or singular values (depending on damping_basis) of the JTJ (Fisher information and approximate Hessian) matrix, whereas ‘invJTJ’ means mu multiplies the reciprocals of these values instead. The ‘adaptive’ mode adaptively chooses a damping strategy.

damping_basis{‘diagonal_values’, ‘singular_values’}

Whether the diagonal or singular values of the JTJ matrix are used during damping. If ‘singular_values’ is selected, then an SVD of the Jacobian (J) matrix is performed and damping is performed in the basis of (right) singular vectors. If ‘diagonal_values’ is selected, the diagonal values of relevant matrices are used as a proxy for the singular values (saving the cost of performing an SVD).
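To make the damping modes more concrete, here is a minimal sketch of how the damped matrix might be formed when damping_basis is ‘diagonal_values’. The function damped_jtj is hypothetical, the ‘adaptive’ mode and damping_clip are left out, and the actual pyGSTi implementation may differ in detail.

```python
def damped_jtj(jtj, mu, damping_mode="identity"):
    """Return a copy of jtj with mu-scaled damping added to its diagonal."""
    damped = [row[:] for row in jtj]
    for i in range(len(jtj)):
        if damping_mode == "identity":
            weight = 1.0              # mu multiplies the identity matrix
        elif damping_mode == "JTJ":
            weight = jtj[i][i]        # mu multiplies diag(JTJ)
        elif damping_mode == "invJTJ":
            weight = 1.0 / jtj[i][i]  # mu multiplies 1 / diag(JTJ)
        else:
            raise ValueError("mode not covered by this sketch: %r" % damping_mode)
        damped[i][i] += mu * weight
    return damped


jtj = [[4.0, 1.0], [1.0, 2.0]]
by_identity = damped_jtj(jtj, 0.5, "identity")
by_jtj = damped_jtj(jtj, 0.5, "JTJ")
by_inv = damped_jtj(jtj, 0.5, "invJTJ")
```

In all three modes only the diagonal changes; the modes differ in how strongly each parameter direction is damped relative to its curvature.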

damping_cliptuple, optional

A 2-tuple giving upper and lower bounds for the values that mu multiplies. If damping_mode == “identity” then this argument is ignored, as mu always multiplies a 1.0 on the diagonal of the identity matrix. If None, then no clipping is applied.

use_accelerationbool, optional

Whether to include a geodesic acceleration term as suggested in arXiv:1201.5885. This is supposed to increase the rate of convergence with very little overhead. In practice we’ve seen mixed results.

uphill_step_thresholdfloat, optional

Allows uphill steps when taking two consecutive steps in nearly the same direction. The condition for accepting an uphill step is that (uphill_step_threshold-beta)*new_objective < old_objective, where beta is the cosine of the angle between successive steps. If uphill_step_threshold == 0 then no uphill steps are allowed; otherwise it should take a value between 1.0 and 2.0, with 1.0 being the most permissive to uphill steps.
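The acceptance condition above can be written out directly. The sketch below assumes plain Python lists for the step vectors; the function names cosine_between and accepts_uphill are illustrative and do not come from pyGSTi.

```python
import math


def cosine_between(step_a, step_b):
    """Cosine of the angle between two successive steps (beta)."""
    dot = sum(a * b for a, b in zip(step_a, step_b))
    norm_a = math.sqrt(sum(a * a for a in step_a))
    norm_b = math.sqrt(sum(b * b for b in step_b))
    return dot / (norm_a * norm_b)


def accepts_uphill(threshold, beta, new_objective, old_objective):
    """Apply the documented condition; a threshold of 0 disables uphill steps."""
    if threshold == 0:
        return False
    return (threshold - beta) * new_objective < old_objective


# Two nearly parallel steps, followed by a 20% increase in the objective.
beta = cosine_between([1.0, 0.0], [1.0, 0.1])
permissive = accepts_uphill(1.0, beta, new_objective=1.2, old_objective=1.0)
strict = accepts_uphill(2.0, beta, new_objective=1.2, old_objective=1.0)
disabled = accepts_uphill(0.0, beta, new_objective=1.2, old_objective=1.0)
```

In this example the uphill step is accepted with a threshold of 1.0 and refused with a threshold of 2.0, which matches the statement that 1.0 is the most permissive setting.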

init_munutuple, optional

If not None, a (mu, nu) tuple of 2 floats giving the initial values for mu and nu.

oob_check_intervalint, optional

Every oob_check_interval outer iterations, the objective function (obj_fn) is called with a second argument ‘oob_check’, set to True. In this case, obj_fn can raise a ValueError exception to indicate that it is Out Of Bounds. If oob_check_interval is 0 then this check is never performed; if 1 then it is always performed.

oob_action{“reject”, “stop”}

What to do when the objective function indicates that it is out of bounds (by raising a ValueError as described above). “reject” means the step is rejected but the optimization proceeds; “stop” means the optimization stops and returns as converged at the last known-in-bounds point.
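The out-of-bounds protocol can be illustrated with a toy objective that accepts the oob_check argument. Everything here is a sketch: the objective, the try_step helper, and the status strings are hypothetical, and plain lists stand in for numpy arrays.

```python
def obj_fn(x, oob_check=False):
    """Toy objective with residuals sqrt(|x_i|) - 1; in bounds when x_i >= 0."""
    if oob_check and any(xi < 0 for xi in x):
        # Signal "out of bounds" the way the optimizer expects.
        raise ValueError("x is out of bounds")
    return [abs(xi) ** 0.5 - 1.0 for xi in x]


def try_step(x, dx, oob_action="reject"):
    """Hypothetical handling of one proposed step from x to x + dx."""
    trial = [xi + di for xi, di in zip(x, dx)]
    try:
        obj_fn(trial, oob_check=True)
    except ValueError:
        # Either way we stay at the last known-in-bounds point.
        status = "rejected" if oob_action == "reject" else "stopped"
        return x, status
    return trial, "accepted"


x_ok, status_ok = try_step([1.0, 2.0], [-0.5, 0.5])
x_rej, status_rej = try_step([1.0, 2.0], [-1.5, 0.0])
x_stop, status_stop = try_step([1.0, 2.0], [-1.5, 0.0], oob_action="stop")
```

A caller would keep iterating after a "rejected" status and would end the optimization after a "stopped" status.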

oob_check_modeint, optional

An advanced option, expert use only. If 0 then the optimization is halted as soon as an attempt is made to evaluate the function out of bounds. If 1 then the optimization is halted only when a would-be accepted step is out of bounds.

serial_solve_proc_thresholdint, optional

When there are fewer than this many processors, the optimizer will solve linear systems serially, using SciPy on a single processor, rather than using a parallelized Gaussian Elimination (with partial pivoting) algorithm coded in Python. Since SciPy’s implementation is more efficient, it’s not worth using the parallel version until there are many processors to spread the work among.

lsvec_mode{‘normal’, ‘percircuit’}

Whether the terms used in the least-squares optimization are the “elements” as computed by the objective function’s .terms() and .lsvec() methods (‘normal’ mode) or the “per-circuit quantities” computed by the objective function’s .percircuit() and .lsvec_percircuit() methods (‘percircuit’ mode).

run(objective, profiler, printer)

Perform the optimization.

Parameters

objectiveObjectiveFunction

The objective function to optimize.

profilerProfiler

A profiler to track resource usage.

printerVerbosityPrinter

The printer to use for sending output to stdout.

pygsti.optimize.customlm.custom_leastsq(obj_fn, jac_fn, x0, f_norm2_tol=1e-06, jac_norm_tol=1e-06, rel_ftol=1e-06, rel_xtol=1e-06, max_iter=100, num_fd_iters=0, max_dx_scale=1.0, damping_mode='identity', damping_basis='diagonal_values', damping_clip=None, use_acceleration=False, uphill_step_threshold=0.0, init_munu='auto', oob_check_interval=0, oob_action='reject', oob_check_mode=0, resource_alloc=None, arrays_interface=None, serial_solve_proc_threshold=100, x_limits=None, verbosity=0, profiler=None)

An implementation of the Levenberg-Marquardt least-squares optimization algorithm customized for use within pyGSTi.

This general-purpose routine mimics, to a large extent, the interface used by scipy.optimize.leastsq, though it implements a newer (and more robust) version of the algorithm.

Parameters

obj_fnfunction

The objective function. Must accept and return 1D numpy ndarrays of length N and M respectively. Same form as scipy.optimize.leastsq.

jac_fnfunction

The Jacobian function (not optional!). Accepts a 1D array of length N and returns an array of shape (M,N).

x0numpy.ndarray

Initial evaluation point.

f_norm2_tolfloat, optional

Tolerance for F^2, where F = norm( sum(obj_fn(x)**2) ) is the least-squares residual. If F^2 < f_norm2_tol, then mark converged.

jac_norm_tolfloat, optional

Tolerance for the Jacobian norm, namely if infn(dot(J.T,f)) < jac_norm_tol then mark converged, where infn is the infinity-norm and f = obj_fn(x).
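The two absolute convergence tests (f_norm2_tol and jac_norm_tol) can be sketched as follows. The sketch assumes that F^2 is the sum of squared residuals and uses nested lists in place of numpy arrays; the helper names are illustrative.

```python
def f_norm2(f):
    """Sum of squared residuals, taken here as the meaning of F^2."""
    return sum(fi * fi for fi in f)


def jtf_inf_norm(jac, f):
    """Infinity norm of dot(J.T, f); jac holds M rows of length N."""
    n_params = len(jac[0])
    jtf = [sum(jac[m][n] * f[m] for m in range(len(f)))
           for n in range(n_params)]
    return max(abs(v) for v in jtf)


f = [1e-4, -2e-4, 1e-4]                     # M = 3 residuals
jac = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # shape (M, N) = (3, 2)

converged_f = f_norm2(f) < 1e-6             # test against f_norm2_tol
converged_jac = jtf_inf_norm(jac, f) < 1e-6  # test against jac_norm_tol
```

Here the residual test passes while the gradient test does not, which shows that the two criteria are independent.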

rel_ftolfloat, optional

Tolerance on the relative reduction in F^2, that is, if d(F^2)/F^2 < rel_ftol then mark converged.

rel_xtolfloat, optional

Tolerance on the relative value of |x|, so that if d(|x|)/|x| < rel_xtol then mark converged.

max_iterint, optional

The maximum number of (outer) iterations.

num_fd_itersint, optional

Internally compute the Jacobian using a finite-difference method for the first num_fd_iters iterations. This is useful when x0 lies at a special or singular point where the analytic Jacobian is misleading.

max_dx_scalefloat, optional

If not None, impose a limit on the magnitude of the step, so that |dx|^2 < max_dx_scale^2 * len(dx) (so elements of dx should be, roughly, less than max_dx_scale).
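One way to picture the max_dx_scale bound is a helper that shrinks an oversized step until |dx|^2 equals max_dx_scale^2 * len(dx). Whether the optimizer rescales the step in this way or handles oversized steps differently is not stated above, so the rescaling strategy and the name limit_step are assumptions.

```python
import math


def limit_step(dx, max_dx_scale):
    """Shrink dx, keeping its direction, so |dx|^2 <= max_dx_scale^2 * len(dx)."""
    if max_dx_scale is None:
        return list(dx)  # no limit requested
    norm2 = sum(d * d for d in dx)
    bound2 = max_dx_scale ** 2 * len(dx)
    if norm2 <= bound2:
        return list(dx)  # already within the bound
    scale = math.sqrt(bound2 / norm2)
    return [d * scale for d in dx]


limited = limit_step([3.0, 4.0], max_dx_scale=1.0)    # |dx|^2 = 25 > 2
unchanged = limit_step([0.5, -0.5], max_dx_scale=1.0)  # |dx|^2 = 0.5 <= 2
```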

damping_mode{‘identity’, ‘JTJ’, ‘invJTJ’, ‘adaptive’}

How damping is applied. ‘identity’ means that the damping parameter mu multiplies the identity matrix. ‘JTJ’ means that mu multiplies the diagonal or singular values (depending on damping_basis) of the JTJ (Fisher information and approximate Hessian) matrix, whereas ‘invJTJ’ means mu multiplies the reciprocals of these values instead. The ‘adaptive’ mode adaptively chooses a damping strategy.

damping_basis{‘diagonal_values’, ‘singular_values’}

Whether the diagonal or singular values of the JTJ matrix are used during damping. If ‘singular_values’ is selected, then an SVD of the Jacobian (J) matrix is performed and damping is performed in the basis of (right) singular vectors. If ‘diagonal_values’ is selected, the diagonal values of relevant matrices are used as a proxy for the singular values (saving the cost of performing an SVD).

damping_cliptuple, optional

A 2-tuple giving upper and lower bounds for the values that mu multiplies. If damping_mode == “identity” then this argument is ignored, as mu always multiplies a 1.0 on the diagonal of the identity matrix. If None, then no clipping is applied.

use_accelerationbool, optional

Whether to include a geodesic acceleration term as suggested in arXiv:1201.5885. This is supposed to increase the rate of convergence with very little overhead. In practice we’ve seen mixed results.

uphill_step_thresholdfloat, optional

Allows uphill steps when taking two consecutive steps in nearly the same direction. The condition for accepting an uphill step is that (uphill_step_threshold-beta)*new_objective < old_objective, where beta is the cosine of the angle between successive steps. If uphill_step_threshold == 0 then no uphill steps are allowed; otherwise it should take a value between 1.0 and 2.0, with 1.0 being the most permissive to uphill steps.

init_munutuple, optional

If not None, a (mu, nu) tuple of 2 floats giving the initial values for mu and nu.

oob_check_intervalint, optional

Every oob_check_interval outer iterations, the objective function (obj_fn) is called with a second argument ‘oob_check’, set to True. In this case, obj_fn can raise a ValueError exception to indicate that it is Out Of Bounds. If oob_check_interval is 0 then this check is never performed; if 1 then it is always performed.

oob_action{“reject”, “stop”}

What to do when the objective function indicates that it is out of bounds (by raising a ValueError as described above). “reject” means the step is rejected but the optimization proceeds; “stop” means the optimization stops and returns as converged at the last known-in-bounds point.

oob_check_modeint, optional

An advanced option, expert use only. If 0 then the optimization is halted as soon as an attempt is made to evaluate the function out of bounds. If 1 then the optimization is halted only when a would-be accepted step is out of bounds.

resource_allocResourceAllocation, optional

When not None, a resource allocation object used for distributing the computation across multiple processors.

arrays_interfaceArraysInterface

An object that provides an interface for creating and manipulating data arrays.

serial_solve_proc_thresholdint, optional

When there are fewer than this many processors, the optimizer will solve linear systems serially, using SciPy on a single processor, rather than using a parallelized Gaussian Elimination (with partial pivoting) algorithm coded in Python. Since SciPy’s implementation is more efficient, it’s not worth using the parallel version until there are many processors to spread the work among.

x_limitsnumpy.ndarray, optional

A (num_params, 2)-shaped array, holding on each row the (min, max) values for the corresponding parameter (element of the “x” vector). If None, then no limits are imposed.
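The layout of x_limits can be shown with a small bounds check. Nested lists stand in for the (num_params, 2) array. Using infinities to mark an unbounded parameter is an assumption of this sketch, and the violations helper is hypothetical.

```python
import math

# One (min, max) row per parameter; the infinities are an assumed way of
# leaving a parameter unbounded.
x_limits = [
    [0.0, 1.0],
    [-math.inf, math.inf],
    [-2.0, 2.0],
]


def violations(x, limits):
    """Indices of the parameters that fall outside their (min, max) row."""
    return [i for i, (xi, (lo, hi)) in enumerate(zip(x, limits))
            if not lo <= xi <= hi]


inside = violations([0.5, 100.0, 0.0], x_limits)
outside = violations([1.5, 0.0, -3.0], x_limits)
```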

verbosityint, optional

Amount of detail to print to stdout.

profilerProfiler, optional

A profiler object used to track timing and memory usage.

Returns

xnumpy.ndarray

The optimal solution.

convergedbool

Whether the solution converged.

msgstr

A message indicating why the solution converged (or didn’t).