pygsti.tools.likelihoodfns
Functions related to computation of the log-likelihood.
Module Contents
Functions
logl | The log-likelihood function.
logl_per_circuit | Computes the per-circuit log-likelihood contribution for a set of circuits.
logl_jacobian | The jacobian of the log-likelihood function.
logl_hessian | The hessian of the log-likelihood function.
logl_approximate_hessian | An approximate Hessian of the log-likelihood function.
logl_max | The maximum log-likelihood possible for a DataSet.
logl_max_per_circuit | The vector of maximum log-likelihood contributions for each circuit, aggregated over outcomes.
two_delta_logl_nsigma | See docstring for pygsti.tools.two_delta_logl()
two_delta_logl | Twice the difference between the maximum and actual log-likelihood.
two_delta_logl_per_circuit | Twice the per-circuit difference between the maximum and actual log-likelihood.
two_delta_logl_term | Term of the 2*[log(L)-upper-bound - log(L)] sum corresponding to a single circuit and spam label.
Attributes
- pygsti.tools.likelihoodfns.TOL = '1e-20'
- pygsti.tools.likelihoodfns.logl(model, dataset, circuits=None, min_prob_clip=1e-06, prob_clip_interval=(-1000000.0, 1000000.0), radius=0.0001, poisson_picture=True, op_label_aliases=None, wildcard=None, mdc_store=None, comm=None, mem_limit=None)
The log-likelihood function.
Parameters
- model : Model
Model of parameterized gates
- dataset : DataSet
Probability data
- circuits : list of (tuples or Circuits), optional
Each element specifies a circuit to include in the log-likelihood sum. Default value of None implies all the circuits in dataset should be used.
- min_prob_clip : float, optional
The minimum probability treated normally in the evaluation of the log-likelihood. A penalty function replaces the true log-likelihood for probabilities that lie below this threshold so that the log-likelihood never becomes undefined (which improves optimizer performance).
- prob_clip_interval : 2-tuple or None, optional
(min, max) values used to clip the probabilities predicted by models during MLEGST’s search for an optimal model. If None, no clipping is performed.
- radius : float, optional
Specifies the severity of rounding used to “patch” the zero-frequency terms of the log-likelihood.
- poisson_picture : boolean, optional
Whether the log-likelihood-in-the-Poisson-picture terms should be included in the returned logl value.
- op_label_aliases : dictionary, optional
Dictionary whose keys are operation label “aliases” and whose values are tuples corresponding to what that operation label should be expanded into before querying the dataset. Defaults to the empty dictionary (no aliases defined). For example: op_label_aliases['Gx^3'] = ('Gx','Gx','Gx')
- wildcard : WildcardBudget
A wildcard budget to apply to this log-likelihood computation. This increases the returned log-likelihood value by adjusting (by a maximal amount measured in TVD, given by the budget) the probabilities produced by model to optimally match the data (within the budgetary constraints) before evaluating the log-likelihood.
- mdc_store : ModelDatasetCircuitsStore, optional
An object that bundles cached quantities along with a given model, dataset, and circuit list. If given, model, dataset, and circuits should be set to None.
- comm : mpi4py.MPI.Comm, optional
When not None, an MPI communicator for distributing the computation across multiple processors.
- mem_limit : int, optional
A rough memory limit in bytes which restricts the amount of intermediate values that are computed and stored.
Returns
- float
The log-likelihood.
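To make the role of poisson_picture concrete, here is a minimal sketch of the quantity being summed. It assumes the commonly used form with constant terms dropped: N*f*log(p) per outcome, minus N*p per outcome in the Poisson picture. The function name logl_sketch and the dict-based inputs are hypothetical stand-ins, not pyGSTi's API, and the sketch omits the min_prob_clip penalty, the radius patching, and wildcard budgets.

```python
import math

def logl_sketch(probs, counts, poisson_picture=True):
    """Toy log-likelihood summed over circuits and outcomes.

    probs, counts: dicts mapping a circuit key to {outcome: value}.
    Assumes every probability paired with a nonzero count is > 0.
    """
    total = 0.0
    for circuit, outcome_counts in counts.items():
        n_total = sum(outcome_counts.values())  # N for this circuit
        for outcome, n in outcome_counts.items():
            p = probs[circuit][outcome]
            if n > 0:
                total += n * math.log(p)        # N * f * log(p), with N*f = n
            if poisson_picture:
                total -= n_total * p            # extra Poisson-picture term
    return total

probs = {"c": {"0": 0.5, "1": 0.5}}
counts = {"c": {"0": 50, "1": 50}}
multinomial_value = logl_sketch(probs, counts, poisson_picture=False)
poisson_value = logl_sketch(probs, counts, poisson_picture=True)
```

When the probabilities of each circuit sum to one, the two values differ by the total number of counts, so the choice mainly matters when probabilities are not normalized.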
- pygsti.tools.likelihoodfns.logl_per_circuit(model, dataset, circuits=None, min_prob_clip=1e-06, prob_clip_interval=(-1000000.0, 1000000.0), radius=0.0001, poisson_picture=True, op_label_aliases=None, wildcard=None, mdc_store=None, comm=None, mem_limit=None)
Computes the per-circuit log-likelihood contribution for a set of circuits.
Parameters
- model : Model
Model of parameterized gates
- dataset : DataSet
Probability data
- circuits : list of (tuples or Circuits), optional
Each element specifies a circuit to include in the log-likelihood sum. Default value of None implies all the circuits in dataset should be used.
- min_prob_clip : float, optional
The minimum probability treated normally in the evaluation of the log-likelihood. A penalty function replaces the true log-likelihood for probabilities that lie below this threshold so that the log-likelihood never becomes undefined (which improves optimizer performance).
- prob_clip_interval : 2-tuple or None, optional
(min, max) values used to clip the probabilities predicted by models during MLEGST’s search for an optimal model. If None, no clipping is performed.
- radius : float, optional
Specifies the severity of rounding used to “patch” the zero-frequency terms of the log-likelihood.
- poisson_picture : boolean, optional
Whether the log-likelihood-in-the-Poisson-picture terms should be included in the returned logl value.
- op_label_aliases : dictionary, optional
Dictionary whose keys are operation label “aliases” and whose values are tuples corresponding to what that operation label should be expanded into before querying the dataset. Defaults to the empty dictionary (no aliases defined). For example: op_label_aliases['Gx^3'] = ('Gx','Gx','Gx')
- wildcard : WildcardBudget
A wildcard budget to apply to this log-likelihood computation. This increases the returned log-likelihood value by adjusting (by a maximal amount measured in TVD, given by the budget) the probabilities produced by model to optimally match the data (within the budgetary constraints) before evaluating the log-likelihood.
- mdc_store : ModelDatasetCircuitsStore, optional
An object that bundles cached quantities along with a given model, dataset, and circuit list. If given, model, dataset, and circuits should be set to None.
- comm : mpi4py.MPI.Comm, optional
When not None, an MPI communicator for distributing the computation across multiple processors.
- mem_limit : int, optional
A rough memory limit in bytes which restricts the amount of intermediate values that are computed and stored.
Returns
- numpy.ndarray
Array of length either len(circuits) or len(dataset.keys()). Values are the log-likelihood contributions of the corresponding circuit aggregated over outcomes.
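The shape of the return value can be illustrated with a small stand-alone sketch. It assumes the Poisson-picture term N*f*log(p) - N*p per outcome and uses plain dicts and lists in place of pyGSTi's Model, DataSet, and numpy arrays; logl_per_circuit_sketch is a hypothetical name, not the library function.

```python
import math

def logl_per_circuit_sketch(probs, counts, circuits=None):
    """One entry per circuit, each aggregated over that circuit's outcomes."""
    # circuits=None means "use every circuit present in the data"
    keys = list(counts.keys()) if circuits is None else list(circuits)
    contributions = []
    for c in keys:
        n_total = sum(counts[c].values())
        term = 0.0
        for outcome, n in counts[c].items():
            p = probs[c][outcome]
            if n > 0:
                term += n * math.log(p)
            term -= n_total * p
        contributions.append(term)
    return contributions

probs = {"a": {"0": 0.9, "1": 0.1}, "b": {"0": 0.4, "1": 0.6}}
counts = {"a": {"0": 88, "1": 12}, "b": {"0": 45, "1": 55}}
all_terms = logl_per_circuit_sketch(probs, counts)
only_b = logl_per_circuit_sketch(probs, counts, circuits=["b"])
```

Summing the entries of all_terms gives the scalar log-likelihood for the same circuits.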
- pygsti.tools.likelihoodfns.logl_jacobian(model, dataset, circuits=None, min_prob_clip=1e-06, prob_clip_interval=(-1000000.0, 1000000.0), radius=0.0001, poisson_picture=True, op_label_aliases=None, mdc_store=None, comm=None, mem_limit=None, verbosity=0)
The jacobian of the log-likelihood function.
Parameters
- model : Model
Model of parameterized gates (including SPAM)
- dataset : DataSet
Probability data
- circuits : list of (tuples or Circuits), optional
Each element specifies a circuit to include in the log-likelihood sum. Default value of None implies all the circuits in dataset should be used.
- min_prob_clip : float, optional
The minimum probability treated normally in the evaluation of the log-likelihood. A penalty function replaces the true log-likelihood for probabilities that lie below this threshold so that the log-likelihood never becomes undefined (which improves optimizer performance).
- prob_clip_interval : 2-tuple or None, optional
(min, max) values used to clip the probabilities predicted by models during MLEGST’s search for an optimal model. If None, no clipping is performed.
- radius : float, optional
Specifies the severity of rounding used to “patch” the zero-frequency terms of the log-likelihood.
- poisson_picture : boolean, optional
Whether the Poisson-picture log-likelihood should be differentiated.
- op_label_aliases : dictionary, optional
Dictionary whose keys are operation label “aliases” and whose values are tuples corresponding to what that operation label should be expanded into before querying the dataset. Defaults to the empty dictionary (no aliases defined). For example: op_label_aliases['Gx^3'] = ('Gx','Gx','Gx')
- mdc_store : ModelDatasetCircuitsStore, optional
An object that bundles cached quantities along with a given model, dataset, and circuit list. If given, model, dataset, and circuits should be set to None.
- comm : mpi4py.MPI.Comm, optional
When not None, an MPI communicator for distributing the computation across multiple processors.
- mem_limit : int, optional
A rough memory limit in bytes which restricts the amount of intermediate values that are computed and stored.
- verbosity : int, optional
How much detail to print to stdout.
Returns
- numpy array
Array of shape (M,), where M is the length of the vectorized model.
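As an illustration of what a length-M Jacobian of the log-likelihood contains, the sketch below uses a hypothetical toy model with one parameter per circuit, where circuit c has outcome probabilities (cos^2(theta_c/2), sin^2(theta_c/2)). It assumes the Poisson-picture term N*f*log(p) - N*p, for which the chain rule gives d(logl)/d(theta_k) = sum over terms of (n/p - N) * dp/d(theta_k). None of these names come from pyGSTi.

```python
import math

def probs_sketch(theta):
    """Toy model: two outcome probabilities per circuit, one parameter each."""
    return [(math.cos(t / 2) ** 2, math.sin(t / 2) ** 2) for t in theta]

def logl_sketch(theta, counts):
    total = 0.0
    for p_c, n_c in zip(probs_sketch(theta), counts):
        n_total = sum(n_c)
        for p, n in zip(p_c, n_c):
            total += n * math.log(p) - n_total * p
    return total

def logl_jacobian_sketch(theta, counts):
    """Gradient of logl_sketch with respect to theta (length M = len(theta))."""
    jac = [0.0] * len(theta)
    for c, (p_c, n_c) in enumerate(zip(probs_sketch(theta), counts)):
        n_total = sum(n_c)
        dp0 = -0.5 * math.sin(theta[c])  # d/dtheta_c of cos^2(theta_c / 2)
        # The second outcome's derivative is the negative of the first's.
        for p, n, dp in zip(p_c, n_c, (dp0, -dp0)):
            jac[c] += (n / p - n_total) * dp
    return jac

theta = [0.7, 1.9]
counts = [(80, 20), (30, 70)]
jac = logl_jacobian_sketch(theta, counts)
```

A central finite difference of logl_sketch in each parameter is a quick way to check an analytic gradient like this one.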
- pygsti.tools.likelihoodfns.logl_hessian(model, dataset, circuits=None, min_prob_clip=1e-06, prob_clip_interval=(-1000000.0, 1000000.0), radius=0.0001, poisson_picture=True, op_label_aliases=None, mdc_store=None, comm=None, mem_limit=None, verbosity=0)
The hessian of the log-likelihood function.
Parameters
- model : Model
Model of parameterized gates (including SPAM)
- dataset : DataSet
Probability data
- circuits : list of (tuples or Circuits), optional
Each element specifies a circuit to include in the log-likelihood sum. Default value of None implies all the circuits in dataset should be used.
- min_prob_clip : float, optional
The minimum probability treated normally in the evaluation of the log-likelihood. A penalty function replaces the true log-likelihood for probabilities that lie below this threshold so that the log-likelihood never becomes undefined (which improves optimizer performance).
- prob_clip_interval : 2-tuple or None, optional
(min, max) values used to clip the probabilities predicted by models during MLEGST’s search for an optimal model. If None, no clipping is performed.
- radius : float, optional
Specifies the severity of rounding used to “patch” the zero-frequency terms of the log-likelihood.
- poisson_picture : boolean, optional
Whether the Poisson-picture log-likelihood should be differentiated.
- op_label_aliases : dictionary, optional
Dictionary whose keys are operation label “aliases” and whose values are tuples corresponding to what that operation label should be expanded into before querying the dataset. Defaults to the empty dictionary (no aliases defined). For example: op_label_aliases['Gx^3'] = ('Gx','Gx','Gx')
- mdc_store : ModelDatasetCircuitsStore, optional
An object that bundles cached quantities along with a given model, dataset, and circuit list. If given, model, dataset, and circuits should be set to None.
- comm : mpi4py.MPI.Comm, optional
When not None, an MPI communicator for distributing the computation across multiple processors.
- mem_limit : int, optional
A rough memory limit in bytes which restricts the amount of intermediate values that are computed and stored.
- verbosity : int, optional
How much detail to print to stdout.
Returns
- numpy array or None
On the root processor, the Hessian matrix of shape (nModelParams, nModelParams), where nModelParams = model.num_params. None on non-root processors.
- pygsti.tools.likelihoodfns.logl_approximate_hessian(model, dataset, circuits=None, min_prob_clip=1e-06, prob_clip_interval=(-1000000.0, 1000000.0), radius=0.0001, poisson_picture=True, op_label_aliases=None, mdc_store=None, comm=None, mem_limit=None, verbosity=0)
An approximate Hessian of the log-likelihood function.
An approximation to the true Hessian is computed using just the Jacobian (and not the Hessian) of the probabilities w.r.t. the model parameters. Let J = d(probs)/d(params) and denote the Hessian of the log-likelihood w.r.t. the probabilities as d2(logl)/dprobs2 (a diagonal matrix indexed by the term, i.e. probability, of the log-likelihood). Then this function computes:
H = J * d2(logl)/dprobs2 * J.T
This simply neglects the d2(probs)/d(params)2 terms of the true Hessian. Since this curvature is expected to be small at the MLE point, this approximation can be useful for computing approximate error bars.
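The product H = J * d2(logl)/dprobs2 * J.T can be sketched in a few lines. The example assumes J is stored with shape (nParams, nTerms) and that the diagonal entries are -n/p^2, which is the second derivative of n*log(p) with respect to p. The function name and the numbers are hypothetical, and plain nested lists stand in for numpy arrays.

```python
def approximate_hessian_sketch(jac, d2logl_dprobs2):
    """Compute H[a][b] = sum_t jac[a][t] * d2logl_dprobs2[t] * jac[b][t].

    jac: nested list of shape (n_params, n_terms), i.e. d(probs)/d(params).
    d2logl_dprobs2: list of length n_terms, the diagonal of the
        Hessian of the log-likelihood with respect to the probabilities.
    """
    n_params = len(jac)
    n_terms = len(d2logl_dprobs2)
    return [
        [
            sum(jac[a][t] * d2logl_dprobs2[t] * jac[b][t] for t in range(n_terms))
            for b in range(n_params)
        ]
        for a in range(n_params)
    ]

# Two parameters, four (circuit, outcome) terms.
term_counts = [80, 20, 30, 70]
term_probs = [0.8, 0.2, 0.3, 0.7]
diag = [-n / p ** 2 for n, p in zip(term_counts, term_probs)]
jac = [
    [1.0, -1.0, 0.0, 0.0],   # parameter 0 only moves the first circuit
    [0.0, 0.0, 0.5, -0.5],   # parameter 1 only moves the second circuit
]
hess = approximate_hessian_sketch(jac, diag)
```

Because only first derivatives of the probabilities appear, the result is symmetric by construction and costs no more than a Jacobian evaluation plus one weighted matrix product.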
Parameters
- model : Model
Model of parameterized gates (including SPAM)
- dataset : DataSet
Probability data
- circuits : list of (tuples or Circuits), optional
Each element specifies a circuit to include in the log-likelihood sum. Default value of None implies all the circuits in dataset should be used.
- min_prob_clip : float, optional
The minimum probability treated normally in the evaluation of the log-likelihood. A penalty function replaces the true log-likelihood for probabilities that lie below this threshold so that the log-likelihood never becomes undefined (which improves optimizer performance).
- prob_clip_interval : 2-tuple or None, optional
(min, max) values used to clip the probabilities predicted by models during MLEGST’s search for an optimal model. If None, no clipping is performed.
- radius : float, optional
Specifies the severity of rounding used to “patch” the zero-frequency terms of the log-likelihood.
- poisson_picture : boolean, optional
Whether the Poisson-picture log-likelihood should be differentiated.
- op_label_aliases : dictionary, optional
Dictionary whose keys are operation label “aliases” and whose values are tuples corresponding to what that operation label should be expanded into before querying the dataset. Defaults to the empty dictionary (no aliases defined). For example: op_label_aliases['Gx^3'] = ('Gx','Gx','Gx')
- mdc_store : ModelDatasetCircuitsStore, optional
An object that bundles cached quantities along with a given model, dataset, and circuit list. If given, model, dataset, and circuits should be set to None.
- comm : mpi4py.MPI.Comm, optional
When not None, an MPI communicator for distributing the computation across multiple processors.
- mem_limit : int, optional
A rough memory limit in bytes which restricts the amount of intermediate values that are computed and stored.
- verbosity : int, optional
How much detail to print to stdout.
Returns
- numpy array or None
On the root processor, the approximate Hessian matrix of shape (nModelParams, nModelParams), where nModelParams = model.num_params. None on non-root processors.
- pygsti.tools.likelihoodfns.logl_max(model, dataset, circuits=None, poisson_picture=True, op_label_aliases=None, mdc_store=None)
The maximum log-likelihood possible for a DataSet.
That is, the log-likelihood obtained by a maximal model that can perfectly fit the probability of each circuit.
Parameters
- model : Model
The model, used only for circuit compilation.
- dataset : DataSet
The data set to use.
- circuits : list of (tuples or Circuits), optional
Each element specifies a circuit to include in the max-log-likelihood sum. Default value of None implies all the circuits in dataset should be used.
- poisson_picture : boolean, optional
Whether the Poisson-picture maximum log-likelihood should be returned.
- op_label_aliases : dictionary, optional
Dictionary whose keys are operation label “aliases” and whose values are tuples corresponding to what that operation label should be expanded into before querying the dataset. Defaults to the empty dictionary (no aliases defined). For example: op_label_aliases['Gx^3'] = ('Gx','Gx','Gx')
- mdc_store : ModelDatasetCircuitsStore, optional
An object that bundles cached quantities along with a given model, dataset, and circuit list. If given, model, dataset, and circuits should be set to None.
Returns
- float
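The "maximal model" idea can be sketched by setting each predicted probability equal to the observed frequency f = n/N. The example assumes the same term convention as a log-likelihood of the form N*f*log(p), minus N*p in the Poisson picture, with constant terms dropped. logl_max_sketch and the dict input are hypothetical, not pyGSTi's implementation.

```python
import math

def logl_max_sketch(counts, poisson_picture=True):
    """Log-likelihood of the data under a model whose probabilities
    equal the observed frequencies exactly.

    counts: dict mapping a circuit key to {outcome: count}.
    """
    total = 0.0
    for outcome_counts in counts.values():
        n_total = sum(outcome_counts.values())
        for n in outcome_counts.values():
            if n > 0:
                total += n * math.log(n / n_total)  # N * f * log(f)
            if poisson_picture:
                total -= n                          # N * p with p = f = n / N
    return total

balanced = {"c": {"0": 50, "1": 50}}
one_sided = {"c": {"0": 100, "1": 0}}
max_balanced = logl_max_sketch(balanced, poisson_picture=False)
max_one_sided = logl_max_sketch(one_sided, poisson_picture=True)
```

Zero-count outcomes contribute nothing to the logarithmic part, which is why they are skipped rather than evaluated as 0*log(0).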
- pygsti.tools.likelihoodfns.logl_max_per_circuit(model, dataset, circuits=None, poisson_picture=True, op_label_aliases=None, mdc_store=None)
The vector of maximum log-likelihood contributions for each circuit, aggregated over outcomes.
Parameters
- model : Model
The model, used only for circuit compilation.
- dataset : DataSet
The data set to use.
- circuits : list of (tuples or Circuits), optional
Each element specifies a circuit to include in the max-log-likelihood sum. Default value of None implies all the circuits in dataset should be used.
- poisson_picture : boolean, optional
Whether the Poisson-picture maximum log-likelihood should be returned.
- op_label_aliases : dictionary, optional
Dictionary whose keys are operation label “aliases” and whose values are tuples corresponding to what that operation label should be expanded into before querying the dataset. Defaults to the empty dictionary (no aliases defined). For example: op_label_aliases['Gx^3'] = ('Gx','Gx','Gx')
- mdc_store : ModelDatasetCircuitsStore, optional
An object that bundles cached quantities along with a given model, dataset, and circuit list. If given, model, dataset, and circuits should be set to None.
Returns
- numpy.ndarray
Array of length either len(circuits) or len(dataset.keys()). Values are the maximum log-likelihood contributions of the corresponding circuit aggregated over outcomes.
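The op_label_aliases expansion that appears in the parameter lists of this module can be illustrated with a small sketch. It treats a circuit as a plain tuple of operation labels and replaces each aliased label with its expansion before the dataset would be queried. expand_aliases_sketch is a hypothetical helper, not a pyGSTi function.

```python
def expand_aliases_sketch(circuit, op_label_aliases=None):
    """Replace each aliased label in a tuple-form circuit by its expansion."""
    aliases = op_label_aliases or {}  # None behaves like "no aliases defined"
    expanded = []
    for label in circuit:
        # Labels without an alias are kept as they are.
        expanded.extend(aliases.get(label, (label,)))
    return tuple(expanded)

aliases = {"Gx^3": ("Gx", "Gx", "Gx")}
lookup_key = expand_aliases_sketch(("Gy", "Gx^3"), aliases)
```

Here lookup_key is the circuit that would actually be looked up in the data, so a dataset only needs entries for the expanded form.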
- pygsti.tools.likelihoodfns.two_delta_logl_nsigma(model, dataset, circuits=None, min_prob_clip=1e-06, prob_clip_interval=(-1000000.0, 1000000.0), radius=0.0001, poisson_picture=True, op_label_aliases=None, dof_calc_method='modeltest', wildcard=None)
See docstring for pygsti.tools.two_delta_logl()
Parameters
- model : Model
Model of parameterized gates
- dataset : DataSet
Probability data
- circuits : list of (tuples or Circuits), optional
Each element specifies a circuit to include in the log-likelihood sum. Default value of None implies all the circuits in dataset should be used.
- min_prob_clip : float, optional
The minimum probability treated normally in the evaluation of the log-likelihood. A penalty function replaces the true log-likelihood for probabilities that lie below this threshold so that the log-likelihood never becomes undefined (which improves optimizer performance).
- prob_clip_interval : 2-tuple or None, optional
(min, max) values used to clip the probabilities predicted by models during MLEGST’s search for an optimal model. If None, no clipping is performed.
- radius : float, optional
Specifies the severity of rounding used to “patch” the zero-frequency terms of the log-likelihood.
- poisson_picture : boolean, optional
Whether the log-likelihood-in-the-Poisson-picture terms should be included in the returned logl value.
- op_label_aliases : dictionary, optional
Dictionary whose keys are operation label “aliases” and whose values are tuples corresponding to what that operation label should be expanded into before querying the dataset. Defaults to the empty dictionary (no aliases defined). For example: op_label_aliases['Gx^3'] = ('Gx','Gx','Gx')
- dof_calc_method : {"all", "modeltest"}
How model’s number of degrees of freedom (parameters) is obtained when computing the number of standard deviations and p-value relative to a chi2_k distribution, where k is the number of additional degrees of freedom possessed by the maximal model. "all" uses model.num_params whereas "modeltest" uses model.num_modeltest_params (the number of non-gauge parameters by default).
- wildcard : WildcardBudget
A wildcard budget to apply to this log-likelihood computation. This increases the returned log-likelihood value by adjusting (by a maximal amount measured in TVD, given by the budget) the probabilities produced by model to optimally match the data (within the budgetary constraints) before evaluating the log-likelihood.
Returns
- float
- pygsti.tools.likelihoodfns.two_delta_logl(model, dataset, circuits=None, min_prob_clip=1e-06, prob_clip_interval=(-1000000.0, 1000000.0), radius=0.0001, poisson_picture=True, op_label_aliases=None, dof_calc_method=None, wildcard=None, mdc_store=None, comm=None)
Twice the difference between the maximum and actual log-likelihood.
Optionally also returns the Nsigma (number of standard deviations from the mean) and p-value relative to the expected chi^2 distribution (when dof_calc_method is not None).
This function’s arguments are a superset of those of logl() and logl_max(). This is a convenience function, equivalent to 2*(logl_max(…) - logl(…)), whose value is what is often called the log-likelihood ratio between the “maximal model” (that which trivially fits the data exactly) and the model given by model.
Parameters
- model : Model
Model of parameterized gates
- dataset : DataSet
Probability data
- circuits : list of (tuples or Circuits), optional
Each element specifies a circuit to include in the log-likelihood sum. Default value of None implies all the circuits in dataset should be used.
- min_prob_clip : float, optional
The minimum probability treated normally in the evaluation of the log-likelihood. A penalty function replaces the true log-likelihood for probabilities that lie below this threshold so that the log-likelihood never becomes undefined (which improves optimizer performance).
- prob_clip_interval : 2-tuple or None, optional
(min, max) values used to clip the probabilities predicted by models during MLEGST’s search for an optimal model. If None, no clipping is performed.
- radius : float, optional
Specifies the severity of rounding used to “patch” the zero-frequency terms of the log-likelihood.
- poisson_picture : boolean, optional
Whether the log-likelihood-in-the-Poisson-picture terms should be included in the computed log-likelihood values.
- op_label_aliases : dictionary, optional
Dictionary whose keys are operation label “aliases” and whose values are tuples corresponding to what that operation label should be expanded into before querying the dataset. Defaults to the empty dictionary (no aliases defined). For example: op_label_aliases['Gx^3'] = ('Gx','Gx','Gx')
- dof_calc_method : {None, "all", "modeltest"}
How model’s number of degrees of freedom (parameters) is obtained when computing the number of standard deviations and p-value relative to a chi2_k distribution, where k is the number of additional degrees of freedom possessed by the maximal model. If None, then Nsigma and pvalue are not returned (see below).
- wildcard : WildcardBudget
A wildcard budget to apply to this log-likelihood computation. This increases the returned log-likelihood value by adjusting (by a maximal amount measured in TVD, given by the budget) the probabilities produced by model to optimally match the data (within the budgetary constraints) before evaluating the log-likelihood.
- mdc_store : ModelDatasetCircuitsStore, optional
An object that bundles cached quantities along with a given model, dataset, and circuit list. If given, model, dataset, and circuits should be set to None.
- comm : mpi4py.MPI.Comm, optional
When not None, an MPI communicator for distributing the computation across multiple processors.
Returns
- twoDeltaLogL : float
2*(loglikelihood(maximal_model,data) - loglikelihood(model,data))
- Nsigma, pvalue : float
Only returned when dof_calc_method is not None.
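The relationship between the three returned quantities can be sketched as follows. The example assumes that Nsigma measures how many standard deviations 2*(logl_max - logl) lies above the mean of a chi2_k distribution (mean k, variance 2k), and that the p-value is the upper-tail probability of that distribution. To stay dependency-free, the tail probability uses the closed form that holds only for even k. All function names are hypothetical, and k is taken as an input rather than derived from a model.

```python
import math

def two_delta_logl_sketch(logl_max_value, logl_value):
    """Twice the gap between the maximal and the actual log-likelihood."""
    return 2.0 * (logl_max_value - logl_value)

def n_sigma_sketch(two_delta_logl, k):
    """Standard deviations above the chi2_k mean (mean k, variance 2k)."""
    return (two_delta_logl - k) / math.sqrt(2.0 * k)

def chi2_pvalue_even_k(two_delta_logl, k):
    """Upper-tail chi2_k probability, valid for positive even k only:
    exp(-x/2) * sum_{j < k/2} (x/2)^j / j!"""
    if k <= 0 or k % 2 != 0:
        raise ValueError("this closed form needs a positive even k")
    half_x = two_delta_logl / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k // 2):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total

two_dlogl = two_delta_logl_sketch(-169.3, -175.3)
n_sigma = n_sigma_sketch(12.0, 4)
p_value = chi2_pvalue_even_k(12.0, 4)
```

A large Nsigma together with a small p-value indicates that the model fits the data noticeably worse than the maximal model would by chance alone.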
- pygsti.tools.likelihoodfns.two_delta_logl_per_circuit(model, dataset, circuits=None, min_prob_clip=1e-06, prob_clip_interval=(-1000000.0, 1000000.0), radius=0.0001, poisson_picture=True, op_label_aliases=None, dof_calc_method=None, wildcard=None, mdc_store=None, comm=None)
Twice the per-circuit difference between the maximum and actual log-likelihood.
Contributions are aggregated over each circuit’s outcomes, but no further.
Optionally (when dof_calc_method is not None) returns parallel vectors containing the Nsigma (number of standard deviations from the mean) and the p-value relative to the expected chi^2 distribution for each circuit.
Parameters
- model : Model
Model of parameterized gates
- dataset : DataSet
Probability data
- circuits : list of (tuples or Circuits), optional
Each element specifies a circuit to include in the log-likelihood sum. Default value of None implies all the circuits in dataset should be used.
- min_prob_clip : float, optional
The minimum probability treated normally in the evaluation of the log-likelihood. A penalty function replaces the true log-likelihood for probabilities that lie below this threshold so that the log-likelihood never becomes undefined (which improves optimizer performance).
- prob_clip_interval : 2-tuple or None, optional
(min, max) values used to clip the probabilities predicted by models during MLEGST’s search for an optimal model. If None, no clipping is performed.
- radius : float, optional
Specifies the severity of rounding used to “patch” the zero-frequency terms of the log-likelihood.
- poisson_picture : boolean, optional
Whether the log-likelihood-in-the-Poisson-picture terms should be included in the returned logl value.
- op_label_aliases : dictionary, optional
Dictionary whose keys are operation label “aliases” and whose values are tuples corresponding to what that operation label should be expanded into before querying the dataset. Defaults to the empty dictionary (no aliases defined). For example: op_label_aliases['Gx^3'] = ('Gx','Gx','Gx')
- dof_calc_method : {"all", "modeltest"}
How model’s number of degrees of freedom (parameters) is obtained when computing the number of standard deviations and p-value relative to a chi2_k distribution, where k is the number of additional degrees of freedom possessed by the maximal model.
- wildcard : WildcardBudget
A wildcard budget to apply to this log-likelihood computation. This increases the returned log-likelihood value by adjusting (by a maximal amount measured in TVD, given by the budget) the probabilities produced by model to optimally match the data (within the budgetary constraints) before evaluating the log-likelihood.
- mdc_store : ModelDatasetCircuitsStore, optional
An object that bundles cached quantities along with a given model, dataset, and circuit list. If given, model, dataset, and circuits should be set to None.
- comm : mpi4py.MPI.Comm, optional
When not None, an MPI communicator for distributing the computation across multiple processors.
Returns
- twoDeltaLogL_terms : numpy.ndarray
- Nsigma, pvalue : numpy.ndarray
Only returned when dof_calc_method is not None.
- pygsti.tools.likelihoodfns.two_delta_logl_term(n, p, f, min_prob_clip=1e-06, poisson_picture=True)
Term of the 2*[log(L)-upper-bound - log(L)] sum corresponding to a single circuit and spam label.
Parameters
- n : float or numpy array
Number of samples.
- p : float or numpy array
Probability of 1st outcome (typically computed).
- f : float or numpy array
Frequency of 1st outcome (typically observed).
- min_prob_clip : float, optional
Minimum probability clip point to avoid evaluating log(number <= zero).
- poisson_picture : boolean, optional
Whether the log-likelihood-in-the-Poisson-picture terms should be included in the returned logl value.
Returns
- float or numpy array
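A scalar-only sketch of a single term may help when reading the parameter descriptions above. It assumes the term is 2*n*f*log(f/p), plus 2*n*(p - f) in the Poisson picture, and that p is simply clipped from below at min_prob_clip. The library's exact handling of small probabilities and of array inputs may differ, and two_delta_logl_term_sketch is a hypothetical name.

```python
import math

def two_delta_logl_term_sketch(n, p, f, min_prob_clip=1e-6, poisson_picture=True):
    """One (circuit, outcome) term of 2*(logl_max - logl), scalars only."""
    p = max(p, min_prob_clip)  # assumption: a plain clip keeps log() finite
    # A zero observed frequency contributes no logarithmic part.
    log_part = 2.0 * n * f * math.log(f / p) if f > 0 else 0.0
    if poisson_picture:
        return log_part + 2.0 * n * (p - f)
    return log_part

perfect_fit = two_delta_logl_term_sketch(100, 0.5, 0.5)
zero_frequency = two_delta_logl_term_sketch(100, 0.1, 0.0)
mismatch = two_delta_logl_term_sketch(100, 0.4, 0.5)
```

The term vanishes when the predicted probability equals the observed frequency and grows as the two move apart.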