pygsti.protocols.vbdataframe

Techniques for manipulating benchmarking data stored in a Pandas DataFrame.

Module Contents

Classes

VBDataFrame

A class for storing a DataFrame that contains volumetric benchmarking data, and that has methods for manipulating that data and creating volumetric-benchmarking-like plots.

Functions

polarization_to_success_probability(p, n)

Inverse of success_probability_to_polarization.

success_probability_to_polarization(s, n)

Maps a success probability s for an n-qubit circuit to the polarization.

classify_circuit_shape(success_probabilities, ...[, ...])

Utility function for computing "capability regions", as introduced in "Measuring the Capabilities of Quantum Computers" arXiv:2008.11294.

pygsti.protocols.vbdataframe.polarization_to_success_probability(p, n)

Inverse of success_probability_to_polarization.

pygsti.protocols.vbdataframe.success_probability_to_polarization(s, n)

Maps a success probability s for an n-qubit circuit to the polarization, defined by p = (s - 1/2^n)/(1 - 1/2^n).
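The two conversions above can be sketched directly from the stated formula. This is a minimal standalone re-implementation for illustration, not the pyGSTi source; the helper variable `uniform` is an assumption introduced here to name the 1/2^n term.

```python
def success_probability_to_polarization(s, n):
    """Rescale the success probability s of an n-qubit circuit to a polarization."""
    uniform = 1 / 2**n  # the 1/2^n term in the formula
    return (s - uniform) / (1 - uniform)


def polarization_to_success_probability(p, n):
    """Invert the map above: recover s from the polarization p."""
    uniform = 1 / 2**n
    return p * (1 - uniform) + uniform


# s = 1 maps to polarization 1, and s = 1/2^n maps to polarization 0.
print(success_probability_to_polarization(1.0, 2))   # 1.0
print(success_probability_to_polarization(0.25, 2))  # 0.0
print(polarization_to_success_probability(0.5, 1))   # 0.75
```

Applying one function after the other returns the original value up to floating-point rounding.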

pygsti.protocols.vbdataframe.classify_circuit_shape(success_probabilities, total_counts, threshold, significance=0.05)

Utility function for computing “capability regions”, as introduced in “Measuring the Capabilities of Quantum Computers” arXiv:2008.11294.

Returns an integer that classifies the input list of success probabilities (SPs) as either

  • “success”: all SPs are above the specified threshold; represented by the int 2.

  • “indeterminate”: some SPs are above and some are below the threshold; represented by the int 1.

  • “fail”: all SPs are below the threshold; represented by the int 0.

This classification is based on a hypothesis test in which the null hypothesis is “success” or “fail”. That is, the set of success probabilities is designated “indeterminate” only if there is statistically significant evidence that at least one success probability is above the threshold and at least one is below it. The details of this hypothesis test are given in Section 8.B.5 of the Supplement of arXiv:2008.11294.

Parameters

success_probabilities : list

List of success probabilities, each of which should be in [0, 1].

total_counts : list

The number of samples from which the success probabilities were computed.

threshold : float

The threshold for designating a success probability as “success”.

significance : float, optional

The statistical significance for the hypothesis test.

Returns

int in (2, 1, 0), corresponding to (“success”, “indeterminate”, “fail”) classifications. If the SPs list is length 0 then NaN is returned, and if it contains only NaN elements then 0 is returned. Otherwise, all NaN elements are ignored.
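The return convention, including the NaN edge cases, can be illustrated with a deliberately simplified sketch. The hypothetical helper `naive_classify` below compares each SP with the threshold directly and omits the hypothesis test entirely (it ignores total_counts and significance), so it is not equivalent to classify_circuit_shape. Treating an SP equal to the threshold as "above" is also an assumption.

```python
import math


def naive_classify(success_probabilities, threshold):
    """Hypothetical helper: threshold-only classification, with no hypothesis test."""
    if len(success_probabilities) == 0:
        return math.nan  # empty input: NaN is returned
    sps = [sp for sp in success_probabilities if not math.isnan(sp)]
    if not sps:
        return 0  # only NaN entries: 0 is returned
    above = [sp >= threshold for sp in sps]  # tie handling is an assumption
    if all(above):
        return 2  # "success"
    if not any(above):
        return 0  # "fail"
    return 1  # "indeterminate"


print(naive_classify([0.9, 0.8], 0.5))           # 2
print(naive_classify([0.9, 0.2], 0.5))           # 1
print(naive_classify([0.1, float("nan")], 0.5))  # 0
```

In the documented function, a mixed list is classified as “indeterminate” only when the evidence is statistically significant, so results near the threshold may differ from this naive version.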

class pygsti.protocols.vbdataframe.VBDataFrame(df, x_axis='Depth', y_axis='Width', x_values=None, y_values=None, edesign=None)

Bases: object

A class for storing a DataFrame that contains volumetric benchmarking data, and that has methods for manipulating that data and creating volumetric-benchmarking-like plots.

Initialize a VBDataFrame object.

Parameters

df : Pandas DataFrame

A DataFrame that contains the volumetric benchmarking data. This sort of DataFrame can be created using ByDepthSummaryStatistics protocols and the to_dataframe() method of the created results object.

x_axis : string, optional

A VBDataFrame is intended to create volumetric-benchmarking-like plots where performance is plotted on an (x, y) grid. This specifies what the x-axis of these plots should be. It should be a column label in the DataFrame.

y_axis : string, optional

A VBDataFrame is intended to create volumetric-benchmarking-like plots where performance is plotted on an (x, y) grid. This specifies what the y-axis of these plots should be. It should be a column label in the DataFrame.

x_values : string or None, optional

y_values : string or None, optional

edesign : ExperimentDesign or None, optional

The ExperimentDesign that corresponds to the data in the dataframe. This is not currently used by any methods in the VBDataFrame.

select_column_value(column_label, column_value)

Filters the dataframe, by discarding all rows of the dataframe for which the column labelled column_label does not have column_value.

Parameters
column_label : string

The label of the column whose value is to be filtered on.

column_value : varied

The value of the column.

Returns
VBDataFrame

A new VBDataFrame that has had the filtering applied to its dataframe.
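The row filter described for select_column_value corresponds to an ordinary boolean mask in pandas. The sketch below applies such a mask to a toy DataFrame; the column names and values are illustrative assumptions, and the code is not the pyGSTi implementation.

```python
import pandas as pd

# Toy data; the column names and values are illustrative assumptions.
df = pd.DataFrame({
    "Depth": [1, 1, 2, 2],
    "Width": [2, 2, 2, 2],
    "Qubits": ["Q0-Q1", "Q2-Q3", "Q0-Q1", "Q2-Q3"],
    "polarization": [0.95, 0.90, 0.88, 0.80],
})

# Discard every row whose "Qubits" entry is not "Q0-Q1".
filtered = df[df["Qubits"] == "Q0-Q1"]
print(filtered)
```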

filter_data(column_label, metric='polarization', statistic='mean', indep_x=True, threshold=1 / _np.e, verbosity=0)

Filters the dataframe by selecting the “best” value at each (x, y) (typically corresponding to circuit shape) for the column specified by column_label. Returns a VBDataFrame whose data contains only one value for the column labelled by column_label at each (x, y).

Parameters
column_label : string

The label of the column whose “best” value at each circuit shape is to be selected. For example, this could be “Qubits”, to select only the data for the best qubit subset at each circuit shape.

metric : string, optional

The data to be used as the figure-of-merit for performance at each (x, y). Must be a column of the dataframe.

statistic : string, optional

The statistic used to reduce the data specified by metric at each (x, y) to a scalar. Allowed values are:

  • ‘max’

  • ‘min’

  • ‘mean’

indep_x : bool, optional

If True, then a value for the column is selected independently at each (x, y). If False, then the same value for the column is selected for every x value at a given y.

threshold : float, optional

Does nothing if indep_x is True. If indep_x is False, then metric and statistic are not enough to uniquely decide which column value is best. In this case, for each y, the value is chosen that maximizes the x value at which the figure-of-merit (as specified by metric and statistic) drops below the threshold. If multiple values drop below the threshold at the same x (or the figure-of-merit never drops below the threshold for multiple values), then the value with the larger figure-of-merit at that x is chosen.

Returns
VBDataFrame

A new VBDataFrame that has had the filtering applied to its dataframe.
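The indep_x=True case with statistic='mean' can be approximated in plain pandas as follows. This is a sketch under stated assumptions, not the pyGSTi implementation: the toy column names, the assumption that a larger mean is “better”, and the tie-breaking (whichever label sorts first) are all choices made here for illustration.

```python
import pandas as pd

# Toy data: two qubit subsets ("A", "B") at two circuit shapes.
df = pd.DataFrame({
    "Depth": [1, 1, 1, 1, 2, 2, 2, 2],
    "Width": [2] * 8,
    "Qubits": ["A", "A", "B", "B", "A", "A", "B", "B"],
    "polarization": [0.9, 0.8, 0.7, 0.6, 0.3, 0.5, 0.6, 0.6],
})

keys = ["Depth", "Width"]

# Mean polarization for each (Depth, Width, Qubits) combination.
means = df.groupby(keys + ["Qubits"], as_index=False)["polarization"].mean()

# For each (Depth, Width), keep the Qubits label with the largest mean.
best = means.sort_values("polarization", ascending=False).drop_duplicates(keys)

# Retain only the rows of the original data that belong to the chosen label.
filtered = df.merge(best[keys + ["Qubits"]], on=keys + ["Qubits"])
print(filtered)
```

Here subset "A" is kept at depth 1 and subset "B" at depth 2, so the chosen label can differ from one (x, y) to the next.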

vb_data(metric='polarization', statistic='mean', lower_cutoff=0.0, no_data_action='discard')

Converts the data into a dictionary, for plotting in a volumetric benchmarking plot. For each (x, y) value (as specified by the axes of this VBDataFrame, and typically circuit shape), pools all of the data specified by metric at that (x, y) and computes the statistic specified by statistic on that data.

Parameters
metric : string, optional

The type of data. Must be a column of the dataframe.

statistic : string, optional

The statistic on the data to be computed at each value of (x, y). Options are:

  • ‘max’: the maximum.

  • ‘min’: the minimum.

  • ‘mean’: the mean.

  • ‘monotonic_max’: the maximum of all the data with (x, y) values that are that large or larger.

  • ‘monotonic_min’: the minimum of all the data with (x, y) values that are that small or smaller.

All these options ignore nan values.

lower_cutoff : float, optional

The value at which to cut off the statistic: takes the maximum of the calculated statistic and this value.

no_data_action : string, optional

Sets what to do when there is no data, or only NaN data, at an (x, y) value:

  • If ‘discard’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will not be a key in the returned dictionary

  • If ‘nan’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will be a key in the returned dictionary and its value will be NaN.

  • If ‘min’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will be a key in the returned dictionary and its value will be the minimal value allowed for this statistic, as specified by lower_cutoff.

Returns
dict

A dictionary where the keys are (x,y) tuples (typically circuit shapes) and the values are the VB data at that (x, y).
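The pooling, lower_cutoff, and no_data_action behaviour described above can be sketched with the standard library alone. The hypothetical helper `pool` below takes (x, y, value) triples rather than a DataFrame and supports only the ‘mean’ statistic; it illustrates the documented behaviour and is not the pyGSTi implementation.

```python
import math
from statistics import mean


def pool(rows, x_values, y_values, lower_cutoff=0.0, no_data_action="discard"):
    """Hypothetical helper: rows are (x, y, value) triples; returns {(x, y): mean}."""
    out = {}
    for x in x_values:
        for y in y_values:
            # Pool the data at this (x, y), ignoring NaN values.
            vals = [v for (rx, ry, v) in rows
                    if (rx, ry) == (x, y) and not math.isnan(v)]
            if vals:
                out[(x, y)] = max(mean(vals), lower_cutoff)
            elif no_data_action == "nan":
                out[(x, y)] = math.nan
            elif no_data_action == "min":
                out[(x, y)] = lower_cutoff
            # "discard": leave (x, y) out of the dictionary
    return out


rows = [(1, 2, 0.9), (1, 2, 0.7), (2, 2, -0.1), (4, 2, float("nan"))]
print(pool(rows, [1, 2, 4], [2]))
print(pool(rows, [1, 2, 4], [2], no_data_action="min"))
```

With ‘discard’ the shape (4, 2), which has only NaN data, is absent from the result, while the negative mean at (2, 2) is raised to the cutoff of 0.0.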

capability_regions(metric='polarization', threshold=1 / _np.e, significance=0.05, monotonic=True, nan_data_action='discard')

Computes a “capability region” from the data, as introduced in “Measuring the Capabilities of Quantum Computers” arXiv:2008.11294. Classifies each (x,y) value (as specified by the x and y axes of the VBDataFrame, which are typically depth and width) as either “success” (the int 2), “indeterminate” (the int 1), “fail” (the int 0), or “no data” (NaN).

Parameters
metric : string, optional

The type of data. Must be ‘polarization’ or ‘success_probability’, and this must be a column in the dataframe.

threshold : float, optional

The threshold for “success”.

significance : float, optional

The statistical significance for the hypothesis tests that are used to classify each circuit shape.

monotonic : bool, optional

If True, makes the region monotonic, i.e., if (x’,y’) > (x,y) then the classification for (x’,y’) is no better than that for (x,y).

nan_data_action : string, optional

If ‘discard’, then an (x,y) value with no data will not be a key in the returned dictionary. Otherwise, its value will be NaN.

Returns
dict

A dictionary where the keys are (x,y) tuples (typically circuit shapes) and the values are in (2, 1, 0, NaN).
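One way to picture the monotonic option is as a post-processing step on a classification dictionary. The hypothetical helper `make_monotonic` below assumes that (x’,y’) ≥ (x,y) is meant componentwise and that the dictionary contains no NaN entries; both are assumptions made for this sketch, which is not the pyGSTi implementation.

```python
def make_monotonic(region):
    """Hypothetical helper: region maps (x, y) -> 0, 1 or 2 (no NaN entries).

    Each shape receives the worst classification found at any shape that is
    no larger in both x and y, so the result never improves as x or y grows.
    """
    return {
        (x, y): min(c for (xs, ys), c in region.items() if xs <= x and ys <= y)
        for (x, y) in region
    }


region = {(1, 1): 2, (2, 1): 1, (4, 1): 2, (1, 2): 2, (2, 2): 0, (4, 2): 1}
print(make_monotonic(region))
```

In this toy input the shape (4, 1) is downgraded from 2 to 1 because the smaller shape (2, 1) is only “indeterminate”, and (4, 2) is downgraded to 0 because (2, 2) fails.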