pygsti.protocols.vbdataframe

Techniques for manipulating benchmarking data stored in a Pandas DataFrame.

Module Contents

Classes

VBDataFrame

A class for storing a DataFrame that contains volumetric benchmarking data, and that has methods for manipulating that data and creating volumetric-benchmarking-like plots.

Functions

_calculate_summary_statistic(x, statistic, lower_cutoff=None)

Utility function that returns statistic(x), or the maximum of statistic(x) and lower_cutoff if lower_cutoff is not None.

polarization_to_success_probability(p, n)

Inverse of success_probability_to_polarization.

success_probability_to_polarization(s, n)

Maps a success probability s for an n-qubit circuit to the polarization.

classify_circuit_shape(success_probabilities, total_counts, threshold, significance=0.05)

Utility function for computing "capability regions", as introduced in "Measuring the Capabilities of Quantum Computers" arXiv:2008.11294.

pygsti.protocols.vbdataframe._calculate_summary_statistic(x, statistic, lower_cutoff=None)

Utility function that returns statistic(x), or the maximum of statistic(x) and lower_cutoff if lower_cutoff is not None.
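The behaviour described above can be sketched in a few lines. This is a standalone illustration, not the library's implementation: the name calculate_summary_statistic is hypothetical, and it assumes that statistic is a callable, which the description above does not confirm.

```python
from statistics import mean


def calculate_summary_statistic(x, statistic, lower_cutoff=None):
    """Return statistic(x), floored at lower_cutoff when one is given.

    Assumption: `statistic` is a callable such as max, min, or mean.
    """
    value = statistic(x)
    if lower_cutoff is None:
        return value
    # With a cutoff, never report a value below it.
    return max(value, lower_cutoff)


# Example: a negative mean polarization is clipped to 0.0.
clipped = calculate_summary_statistic([-0.2, 0.1], mean, lower_cutoff=0.0)
```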

pygsti.protocols.vbdataframe.polarization_to_success_probability(p, n)

Inverse of success_probability_to_polarization.

pygsti.protocols.vbdataframe.success_probability_to_polarization(s, n)

Maps a success probability s for an n-qubit circuit to the polarization, defined by p = (s - 1/2^n)/(1 - 1/2^n).
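As a worked illustration of the formula above and its inverse, s = p(1 - 1/2^n) + 1/2^n, here is a minimal standalone sketch. The function names sp_to_pol and pol_to_sp are hypothetical stand-ins and are not the library functions themselves.

```python
def sp_to_pol(s, n):
    """Map a success probability s of an n-qubit circuit to a polarization."""
    return (s - 1 / 2**n) / (1 - 1 / 2**n)


def pol_to_sp(p, n):
    """Inverse map: solve p = (s - 1/2^n) / (1 - 1/2^n) for s."""
    return p * (1 - 1 / 2**n) + 1 / 2**n


# For n = 2 a uniformly random outcome succeeds with probability 1/4,
# which corresponds to zero polarization; s = 1 corresponds to p = 1.
p_random = sp_to_pol(0.25, 2)
p_perfect = sp_to_pol(1.0, 2)
```

The rescaling places the success probability of uniformly random outcomes, 1/2^n, at p = 0, so that polarizations can be compared across circuits of different widths.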

pygsti.protocols.vbdataframe.classify_circuit_shape(success_probabilities, total_counts, threshold, significance=0.05)

Utility function for computing “capability regions”, as introduced in “Measuring the Capabilities of Quantum Computers” arXiv:2008.11294.

Returns an integer that classifies the input list of success probabilities (SPs) as either

  • “success”: all SPs are above the specified threshold, specified by the int 2.

  • “indeterminate”: some SPs are above and some are below the threshold, specified by the int 1.

  • “fail”: all SPs are below the threshold, specified by the int 0.

This classification is based on a hypothesis test in which the null hypothesis is “success” or “fail”. That is, the set of success probabilities is designated “indeterminate” only if there is statistically significant evidence that at least one success probability is above the threshold and at least one is below it. The details of this hypothesis testing are given in Section 8.B.5 of the Supplement of arXiv:2008.11294.

Parameters
  • success_probabilities (list) – List of success probabilities, should all be in [0,1].

  • total_counts (list) – The number of samples from which the success probabilities were computed.

  • threshold (float) – The threshold for designating a success probability as “success”.

  • significance (float, optional) – The statistical significance for the hypothesis test.

Returns

  • int in (2, 1, 0), corresponding to (“success”, “indeterminate”, “fail”) classifications.

  • If the SPs list is length 0 then NaN is returned, and if it contains only NaN elements then 0 is returned. Otherwise, all NaN elements are ignored.
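To make the return-value conventions concrete, the sketch below classifies a list of SPs by comparing point estimates directly with the threshold. It deliberately omits the hypothesis test (so it takes no total_counts or significance) and can therefore report “indeterminate” where the documented function would not. The name naive_classify is hypothetical, and treating an SP exactly equal to the threshold as not above it is an assumption.

```python
import math


def naive_classify(success_probabilities, threshold):
    """Point-estimate classification: 2 = success, 1 = indeterminate, 0 = fail.

    Simplification: no hypothesis test is performed, unlike the
    documented function.
    """
    if len(success_probabilities) == 0:
        return math.nan  # empty input: "no data"
    sps = [sp for sp in success_probabilities if not math.isnan(sp)]
    if not sps:
        return 0  # only NaN elements
    above = [sp > threshold for sp in sps]
    if all(above):
        return 2
    if not any(above):
        return 0
    return 1


label = naive_classify([0.9, 0.2, math.nan], threshold=0.5)  # mixed SPs
```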

class pygsti.protocols.vbdataframe.VBDataFrame(df, x_axis='Depth', y_axis='Width', x_values=None, y_values=None, edesign=None)

Bases: object

A class for storing a DataFrame that contains volumetric benchmarking data, and that has methods for manipulating that data and creating volumetric-benchmarking-like plots.

select_column_value(self, column_label, column_value)

Filters the dataframe by discarding all rows for which the column labelled column_label does not have the value column_value.

Parameters
  • column_label (string) – The label of the column whose value is to be filtered on.

  • column_value (varied) – The value that a row must have in that column to be kept.

Returns

VBDataFrame – A new VBDataFrame that has had the filtering applied to its dataframe.

filter_data(self, column_label, metric='polarization', statistic='mean', indep_x=True, threshold=1 / _np.e, verbosity=0)

Filters the dataframe by selecting the “best” value at each (x, y) (typically corresponding to circuit shape) for the column specified by column_label. Returns a VBDataFrame whose data contains only one value for the column labelled by column_label for each (x, y).

Parameters
  • column_label (string) – The label of the column whose “best” value at each circuit shape is to be selected. For example, this could be “Qubits”, to select only the data for the best qubit subset at each circuit shape.

  • metric (string, optional) – The data to be used as the figure-of-merit for performance at each (x, y). Must be a column of the dataframe.

  • statistic (string, optional) – The statistic used to reduce the data specified by metric at each (x, y) to a scalar. Allowed values are ‘max’, ‘min’, and ‘mean’.

  • indep_x (bool, optional) – If True, then an independent value, for the column, is selected at each (x, y) value. If False, then the same value for the column is selected for every x value for a given y.

  • threshold (float, optional) – Does nothing if indep_x is True. If indep_x is False, then ‘metric’ and ‘statistic’ are not enough to uniquely decide which column value is best. In this case, the value is chosen that, for each y in (x,y), maximizes the x value at which the figure-of-merit (as specified by the metric and statistic) drops below the threshold. If there are multiple values that drop below the threshold at the same x (or the figure-of-merit never drops below the threshold for multiple values), then the value with the larger figure-of-merit at that x is chosen.

Returns

VBDataFrame – A new VBDataFrame that has had the filtering applied to its dataframe.
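The indep_x=True case can be illustrated with plain Python records standing in for the dataframe. This is a hedged sketch, not the method's implementation: the function name best_label_per_shape and the (x, y, label, value) record layout are hypothetical, and it assumes that “best” means the largest mean of the metric.

```python
from collections import defaultdict
from statistics import mean


def best_label_per_shape(records):
    """Pick, independently at each (x, y), the label with the largest mean.

    records: iterable of (x, y, label, value) tuples (hypothetical layout).
    """
    pooled = defaultdict(list)
    for x, y, label, value in records:
        pooled[(x, y, label)].append(value)

    best = {}  # (x, y) -> (label, score)
    for (x, y, label), values in pooled.items():
        score = mean(values)
        if (x, y) not in best or score > best[(x, y)][1]:
            best[(x, y)] = (label, score)
    return {shape: label for shape, (label, _) in best.items()}


records = [
    (2, 1, "Q0", 0.9), (2, 1, "Q0", 0.7), (2, 1, "Q1", 0.85),
    (4, 1, "Q0", 0.7), (4, 1, "Q1", 0.6),
]
chosen = best_label_per_shape(records)
```

Here "Q1" wins at (2, 1) and "Q0" wins at (4, 1), so the selected label differs between x values. With indep_x=False a single label per y would be chosen instead, using the threshold rule described above.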

vb_data(self, metric='polarization', statistic='mean', lower_cutoff=0.0, no_data_action='discard')

Converts the data into a dictionary, for plotting in a volumetric benchmarking plot. For each (x, y) value (as specified by the axes of this VBDataFrame, and typically circuit shape), pools all of the data specified by metric at that (x, y) and computes the statistic specified by statistic on the pooled data.

Parameters
  • metric (string, optional) – The type of data. Must be a column of the dataframe.

  • statistic (string, optional) –

    The statistic on the data to be computed at each value of (x, y). Options are:

    • ‘max’: the maximum.

    • ‘min’: the minimum.

    • ‘mean’: the mean.

    • ‘monotonic_max’: the maximum of all the data with (x, y) values that are that large or larger.

    • ‘monotonic_min’: the minimum of all the data with (x, y) values that are that small or smaller.

    All these options ignore NaN values.

  • lower_cutoff (float, optional) – The value at which to cut off the statistic: the result is the maximum of the calculated statistic and this value.

  • no_data_action (string, optional) –

    Sets what to do when there is no data, or only NaN data, at an (x, y) value:

    • If ‘discard’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will not be a key in the returned dictionary.

    • If ‘nan’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will be a key in the returned dictionary and its value will be NaN.

    • If ‘min’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will be a key in the returned dictionary and its value will be the minimal value allowed for this statistic, as specified by lower_cutoff.

Returns

dict – A dictionary where the keys are (x,y) tuples (typically circuit shapes) and the values are the VB data at that (x, y).
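The pooling, cutoff, and no_data_action rules above can be sketched with plain Python records in place of the dataframe. The function vb_data_sketch, its (x, y, value) record layout, and its explicit x_values and y_values arguments are hypothetical, and only the ‘max’, ‘min’, and ‘mean’ statistics are covered.

```python
import math
from statistics import mean

STATISTICS = {"max": max, "min": min, "mean": mean}


def vb_data_sketch(records, x_values, y_values, statistic="mean",
                   lower_cutoff=0.0, no_data_action="discard"):
    """Pool values at each (x, y) and reduce them to one number.

    records: list of (x, y, value) tuples (hypothetical layout).
    """
    result = {}
    for x in x_values:
        for y in y_values:
            # Pool the data at this shape, ignoring NaN values.
            values = [v for (rx, ry, v) in records
                      if rx == x and ry == y and not math.isnan(v)]
            if values:
                stat = STATISTICS[statistic](values)
                result[(x, y)] = max(stat, lower_cutoff)
            elif no_data_action == "nan":
                result[(x, y)] = math.nan
            elif no_data_action == "min":
                result[(x, y)] = lower_cutoff
            # "discard": leave (x, y) out of the dictionary.
    return result


records = [(2, 1, 0.9), (2, 1, 0.7), (4, 1, -0.1), (4, 2, math.nan)]
data = vb_data_sketch(records, x_values=[2, 4], y_values=[1, 2])
```

With the default ‘discard’ action, only (2, 1) and (4, 1) appear as keys, and the negative value at (4, 1) is raised to the cutoff of 0.0.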

capability_regions(self, metric='polarization', threshold=1 / _np.e, significance=0.05, monotonic=True, nan_data_action='discard')

Computes a “capability region” from the data, as introduced in “Measuring the Capabilities of Quantum Computers” arXiv:2008.11294. Classifies each (x,y) value (as specified by the x and y axes of the VBDataFrame, which are typically depth and width) as either “success” (the int 2), “indeterminate” (the int 1), “fail” (the int 0), or “no data” (NaN).

Parameters
  • metric (string, optional) – The type of data. Must be ‘polarization’ or ‘success_probability’, and this must be a column in the dataframe.

  • threshold (float, optional) – The threshold for “success”.

  • significance (float, optional) – The statistical significance for the hypothesis tests that are used to classify each circuit shape.

  • monotonic (bool, optional) – If True, makes the region monotonic, i.e., if (x’,y’) > (x,y) then the classification for (x’,y’) is less/worse than for (x,y).

  • nan_data_action (string, optional) – If ‘discard’ then when there is no data for an (x,y) value, this (x,y) value will not be a key in the returned dictionary. Otherwise the value will be NaN.

Returns

dict – A dictionary where the keys are (x,y) tuples (typically circuit shapes) and the values are in (2, 1, 0, NaN).
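One possible way to enforce the monotonic option on an already classified region is sketched below. It is not necessarily how the method does it. The function name make_monotonic is hypothetical, and the sketch assumes a componentwise ordering, so that each shape takes the worst classification found at any shape that is no larger in both x and y.

```python
import math


def make_monotonic(regions):
    """Replace each label by the minimum label over all smaller-or-equal shapes.

    regions: dict mapping (x, y) -> 2, 1, 0, or NaN. NaN entries are left
    untouched and do not affect other shapes (an assumption of this sketch).
    """
    out = {}
    for (x, y), label in regions.items():
        if math.isnan(label):
            out[(x, y)] = label
            continue
        # Labels at all shapes (xp, yp) with xp <= x and yp <= y,
        # including (x, y) itself, so the list is never empty.
        dominated = [l for (xp, yp), l in regions.items()
                     if xp <= x and yp <= y and not math.isnan(l)]
        out[(x, y)] = min(dominated)
    return out


regions = {(1, 1): 2, (2, 1): 1, (4, 1): 2, (1, 2): 2, (2, 2): 2}
monotonic = make_monotonic(regions)
```

In this example (4, 1) and (2, 2) are both downgraded from 2 to 1 because the smaller shape (2, 1) is only “indeterminate”.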