pygsti.optimize.customlm
Custom implementation of the Levenberg-Marquardt Algorithm
Module Contents
Classes
- OptimizerResult: The result from an optimization.
- Optimizer: An optimizer. Optimizes an objective function.
- CustomLMOptimizer: A Levenberg-Marquardt optimizer customized for GST-like problems.
Functions
- custom_leastsq: An implementation of the Levenberg-Marquardt least-squares optimization algorithm customized for use within pyGSTi.
- class pygsti.optimize.customlm.OptimizerResult(objective_func, opt_x, opt_f=None, opt_jtj=None, opt_unpenalized_f=None, chi2_k_distributed_qty=None, optimizer_specific_qtys=None)
Bases:
object
The result from an optimization.
Parameters
- objective_func : ObjectiveFunction
The objective function that was optimized.
- opt_x : numpy.ndarray
The optimal argument (x) value. Often a vector of parameters.
- opt_f : numpy.ndarray
The optimal objective function (f) value. Often this is the least-squares vector of objective function values.
- opt_jtj : numpy.ndarray, optional
The optimal dot(transpose(J), J) value, where J is the Jacobian matrix. This may be useful for computing approximate error bars.
- opt_unpenalized_f : numpy.ndarray, optional
The optimal objective function (f) value with any penalty terms removed.
- chi2_k_distributed_qty : float, optional
A value that is supposed to be chi2_k distributed.
- optimizer_specific_qtys : dict, optional
A dictionary of additional optimization parameters.
- objective_func
- x
- f
- jtj
- f_no_penalties
- optimizer_specific_qtys
- chi2_k_distributed_qty
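The attribute names listed above appear to mirror the constructor arguments (opt_x is stored as x, opt_f as f, opt_jtj as jtj, and so on). A minimal sketch of reading these fields, assuming the constructor simply stores its arguments and using None as a stand-in for a real ObjectiveFunction:

```python
import numpy as np
from pygsti.optimize.customlm import OptimizerResult

# Illustration only: results are normally built by the optimizer itself.
result = OptimizerResult(None,                           # stand-in ObjectiveFunction
                         opt_x=np.array([0.1, 0.2]),     # optimal parameters
                         opt_f=np.array([0.01, -0.02]))  # least-squares vector

best_params = result.x                     # from opt_x
final_lsvec = result.f                     # from opt_f
total_obj = float(np.sum(final_lsvec**2))  # scalar sum-of-squares objective
```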
- class pygsti.optimize.customlm.Optimizer
Bases:
pygsti.baseobjs.nicelyserializable.NicelySerializable
An optimizer. Optimizes an objective function.
- class pygsti.optimize.customlm.CustomLMOptimizer(maxiter=100, maxfev=100, tol=1e-06, fditer=0, first_fditer=0, damping_mode='identity', damping_basis='diagonal_values', damping_clip=None, use_acceleration=False, uphill_step_threshold=0.0, init_munu='auto', oob_check_interval=0, oob_action='reject', oob_check_mode=0, serial_solve_proc_threshold=100, lsvec_mode='normal')
Bases:
Optimizer
A Levenberg-Marquardt optimizer customized for GST-like problems.
Parameters
- maxiter : int, optional
The maximum number of (outer) iterations.
- maxfev : int, optional
The maximum number of function evaluations.
- tol : float or dict, optional
The tolerance, specified as a single float or as a dict with keys {'relx', 'relf', 'jac', 'maxdx'}. A single float sets the 'relf' and 'jac' elements and leaves the others at their default values.
- fditer : int, optional
Internally compute the Jacobian using a finite-difference method for the first fditer iterations. This is useful when the initial point lies at a special or singular point where the analytic Jacobian is misleading.
- first_fditer : int, optional
Number of finite-difference iterations applied to the first stage of the optimization (only). Unused.
- damping_mode : {'identity', 'JTJ', 'invJTJ', 'adaptive'}
How damping is applied. 'identity' means that the damping parameter mu multiplies the identity matrix. 'JTJ' means that mu multiplies the diagonal or singular values (depending on damping_basis) of the JTJ matrix (the Fisher information and approximate Hessian), whereas 'invJTJ' means mu multiplies the reciprocals of these values instead. The 'adaptive' mode adaptively chooses a damping strategy.
- damping_basis : {'diagonal_values', 'singular_values'}
Whether the diagonal or singular values of the JTJ matrix are used during damping. If 'singular_values' is selected, then an SVD of the Jacobian (J) matrix is performed and damping is applied in the basis of (right) singular vectors. If 'diagonal_values' is selected, the diagonal values of the relevant matrices are used as a proxy for the singular values (saving the cost of performing an SVD).
- damping_clip : tuple, optional
A 2-tuple giving upper and lower bounds for the values that mu multiplies. If damping_mode == 'identity' then this argument is ignored, as mu always multiplies a 1.0 on the diagonal of the identity matrix. If None, then no clipping is applied.
- use_acceleration : bool, optional
Whether to include a geodesic acceleration term as suggested in arXiv:1201.5885. This is supposed to increase the rate of convergence with very little overhead. In practice we've seen mixed results.
- uphill_step_threshold : float, optional
Allows uphill steps when taking two consecutive steps in nearly the same direction. The condition for accepting an uphill step is that (uphill_step_threshold - beta) * new_objective < old_objective, where beta is the cosine of the angle between successive steps. If uphill_step_threshold == 0 then no uphill steps are allowed; otherwise it should take a value between 1.0 and 2.0, with 1.0 being the most permissive to uphill steps. (See the illustrative sketch after this parameter list.)
- init_munu : tuple, optional
If not None, a (mu, nu) tuple of 2 floats giving the initial values for mu and nu.
- oob_check_interval : int, optional
Every oob_check_interval outer iterations, the objective function (obj_fn) is called with a second argument 'oob_check', set to True. In this case, obj_fn can raise a ValueError exception to indicate that it is Out Of Bounds. If oob_check_interval is 0 then this check is never performed; if 1 then it is always performed.
- oob_action : {'reject', 'stop'}
What to do when the objective function indicates that it is out of bounds (by raising a ValueError as described above). 'reject' means the step is rejected but the optimization proceeds; 'stop' means the optimization stops and returns as converged at the last known in-bounds point.
- oob_check_mode : int, optional
An advanced option, expert use only. If 0, the optimization is halted as soon as an attempt is made to evaluate the function out of bounds. If 1, the optimization is halted only when a would-be accepted step is out of bounds.
- serial_solve_proc_threshold : int, optional
When there are fewer than this many processors, the optimizer will solve linear systems serially, using SciPy on a single processor, rather than using a parallelized Gaussian Elimination (with partial pivoting) algorithm coded in Python. Since SciPy's implementation is more efficient, it's not worth using the parallel version until there are many processors to spread the work among.
- lsvec_mode : {'normal', 'percircuit'}
Whether the terms used in the least-squares optimization are the "elements" as computed by the objective function's .terms() and .lsvec() methods ('normal' mode) or the "per-circuit quantities" computed by the objective function's .percircuit() and .lsvec_percircuit() methods ('percircuit' mode).
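For concreteness, here is a small, hypothetical helper (not part of pyGSTi's API) implementing the uphill-step acceptance rule described under uphill_step_threshold, where dx_prev and dx_new stand for the previous and proposed parameter steps:

```python
import numpy as np

def accept_uphill_step(dx_prev, dx_new, old_objective, new_objective,
                       uphill_step_threshold):
    """Return True if a (possibly uphill) step should be accepted."""
    # beta: cosine of the angle between successive steps
    beta = np.dot(dx_prev, dx_new) / (np.linalg.norm(dx_prev)
                                      * np.linalg.norm(dx_new))
    # Acceptance condition quoted above
    return (uphill_step_threshold - beta) * new_objective < old_objective
```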
- maxiter
- maxfev
- tol
- fditer
- first_fditer
- damping_mode
- damping_basis
- damping_clip
- use_acceleration
- uphill_step_threshold
- init_munu
- oob_check_interval
- oob_action
- oob_check_mode
- array_types
- called_objective_methods = ('lsvec', 'dlsvec')
- serial_solve_proc_threshold
- lsvec_mode
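A usage sketch based only on the signature above; the constructed optimizer would then be passed wherever pyGSTi accepts an optimizer object (for example, GST protocol objects take an optimizer argument in recent pyGSTi versions, though check the version at hand):

```python
from pygsti.optimize.customlm import CustomLMOptimizer

opt = CustomLMOptimizer(
    maxiter=200,
    tol={'relf': 1e-8, 'jac': 1e-8},  # unspecified keys keep their defaults
    damping_mode='adaptive',          # let the optimizer choose a damping strategy
    oob_check_interval=1,             # check bounds on every outer iteration
    oob_action='reject',              # reject out-of-bounds steps but keep going
)
```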
- pygsti.optimize.customlm.custom_leastsq(obj_fn, jac_fn, x0, f_norm2_tol=1e-06, jac_norm_tol=1e-06, rel_ftol=1e-06, rel_xtol=1e-06, max_iter=100, num_fd_iters=0, max_dx_scale=1.0, damping_mode='identity', damping_basis='diagonal_values', damping_clip=None, use_acceleration=False, uphill_step_threshold=0.0, init_munu='auto', oob_check_interval=0, oob_action='reject', oob_check_mode=0, resource_alloc=None, arrays_interface=None, serial_solve_proc_threshold=100, x_limits=None, verbosity=0, profiler=None)
An implementation of the Levenberg-Marquardt least-squares optimization algorithm customized for use within pyGSTi.
This general-purpose routine mimics, to a large extent, the interface used by scipy.optimize.leastsq, though it implements a newer (and more robust) version of the algorithm.
Parameters
- obj_fn : function
The objective function. Must accept and return 1D numpy ndarrays of length N and M respectively. Same form as scipy.optimize.leastsq.
- jac_fn : function
The Jacobian function (not optional!). Accepts a 1D array of length N and returns an array of shape (M, N).
- x0 : numpy.ndarray
Initial evaluation point.
- f_norm2_tol : float, optional
Tolerance for F^2, where F = norm(obj_fn(x)) is the least-squares residual (so F**2 = sum(obj_fn(x)**2)). If F**2 < f_norm2_tol, then mark converged.
- jac_norm_tol : float, optional
Tolerance for the Jacobian norm: if infn(dot(J.T, f)) < jac_norm_tol then mark converged, where infn is the infinity norm and f = obj_fn(x).
- rel_ftol : float, optional
Tolerance on the relative reduction in F^2, that is, if d(F^2)/F^2 < rel_ftol then mark converged.
- rel_xtol : float, optional
Tolerance on the relative value of |x|, so that if d(|x|)/|x| < rel_xtol then mark converged.
- max_iter : int, optional
The maximum number of (outer) iterations.
- num_fd_iters : int, optional
Internally compute the Jacobian using a finite-difference method for the first num_fd_iters iterations. This is useful when x0 lies at a special or singular point where the analytic Jacobian is misleading.
- max_dx_scale : float, optional
If not None, impose a limit on the magnitude of the step, so that |dx|^2 < max_dx_scale^2 * len(dx) (so elements of dx should be, roughly, less than max_dx_scale).
- damping_mode : {'identity', 'JTJ', 'invJTJ', 'adaptive'}
How damping is applied. 'identity' means that the damping parameter mu multiplies the identity matrix. 'JTJ' means that mu multiplies the diagonal or singular values (depending on damping_basis) of the JTJ matrix (the Fisher information and approximate Hessian), whereas 'invJTJ' means mu multiplies the reciprocals of these values instead. The 'adaptive' mode adaptively chooses a damping strategy.
- damping_basis : {'diagonal_values', 'singular_values'}
Whether the diagonal or singular values of the JTJ matrix are used during damping. If 'singular_values' is selected, then an SVD of the Jacobian (J) matrix is performed and damping is applied in the basis of (right) singular vectors. If 'diagonal_values' is selected, the diagonal values of the relevant matrices are used as a proxy for the singular values (saving the cost of performing an SVD).
- damping_clip : tuple, optional
A 2-tuple giving upper and lower bounds for the values that mu multiplies. If damping_mode == 'identity' then this argument is ignored, as mu always multiplies a 1.0 on the diagonal of the identity matrix. If None, then no clipping is applied.
- use_acceleration : bool, optional
Whether to include a geodesic acceleration term as suggested in arXiv:1201.5885. This is supposed to increase the rate of convergence with very little overhead. In practice we've seen mixed results.
- uphill_step_threshold : float, optional
Allows uphill steps when taking two consecutive steps in nearly the same direction. The condition for accepting an uphill step is that (uphill_step_threshold - beta) * new_objective < old_objective, where beta is the cosine of the angle between successive steps. If uphill_step_threshold == 0 then no uphill steps are allowed; otherwise it should take a value between 1.0 and 2.0, with 1.0 being the most permissive to uphill steps.
- init_munu : tuple, optional
If not None, a (mu, nu) tuple of 2 floats giving the initial values for mu and nu.
- oob_check_interval : int, optional
Every oob_check_interval outer iterations, the objective function (obj_fn) is called with a second argument 'oob_check', set to True. In this case, obj_fn can raise a ValueError exception to indicate that it is Out Of Bounds. If oob_check_interval is 0 then this check is never performed; if 1 then it is always performed.
- oob_action : {'reject', 'stop'}
What to do when the objective function indicates that it is out of bounds (by raising a ValueError as described above). 'reject' means the step is rejected but the optimization proceeds; 'stop' means the optimization stops and returns as converged at the last known in-bounds point.
- oob_check_mode : int, optional
An advanced option, expert use only. If 0, the optimization is halted as soon as an attempt is made to evaluate the function out of bounds. If 1, the optimization is halted only when a would-be accepted step is out of bounds.
- resource_alloc : ResourceAllocation, optional
When not None, a resource allocation object used for distributing the computation across multiple processors.
- arrays_interface : ArraysInterface
An object that provides an interface for creating and manipulating data arrays.
- serial_solve_proc_threshold : int, optional
When there are fewer than this many processors, the optimizer will solve linear systems serially, using SciPy on a single processor, rather than using a parallelized Gaussian Elimination (with partial pivoting) algorithm coded in Python. Since SciPy's implementation is more efficient, it's not worth using the parallel version until there are many processors to spread the work among.
- x_limits : numpy.ndarray, optional
A (num_params, 2)-shaped array, holding on each row the (min, max) values for the corresponding parameter (element of the "x" vector). If None, then no limits are imposed.
- verbosity : int, optional
Amount of detail to print to stdout.
- profiler : Profiler, optional
A profiler object used to track timing and memory usage.
Returns
- x : numpy.ndarray
The optimal solution.
- converged : bool
Whether the solution converged.
- msg : str
A message indicating why the solution converged (or didn’t).
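In normal use custom_leastsq is driven by CustomLMOptimizer, which supplies pyGSTi objective functions, a ResourceAllocation, and an ArraysInterface. As a standalone sketch of the documented call and return shape, on a toy exponential fit, assuming the None defaults for those arguments suffice (this may depend on the pyGSTi version):

```python
import numpy as np
from pygsti.optimize.customlm import custom_leastsq

# Toy problem: fit y = a * exp(b * t), so x = (a, b).
t = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(-1.5 * t)

def obj_fn(x):
    a, b = x
    return a * np.exp(b * t) - y              # length-M residual vector

def jac_fn(x):
    a, b = x
    e = np.exp(b * t)
    return np.stack([e, a * t * e], axis=1)   # (M, N) Jacobian

x_opt, converged, msg = custom_leastsq(obj_fn, jac_fn, np.array([1.0, 0.0]),
                                       max_iter=100, rel_ftol=1e-8)
print(converged, msg, x_opt)
```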