pygsti.protocols.vbdataframe

Techniques for manipulating benchmarking data stored in a Pandas DataFrame.

Module Contents

Classes

VBDataFrame

A class for storing a DataFrame that contains volumetric benchmarking data, and that has methods for manipulating that data and creating volumetric-benchmarking-like plots.

Functions

classify_circuit_shape(success_probabilities, ...[, ...])

Utility function for computing "capability regions", as introduced in "Measuring the Capabilities of Quantum Computers" arXiv:2008.11294.

pygsti.protocols.vbdataframe.classify_circuit_shape(success_probabilities, total_counts, threshold, significance=0.05)

Utility function for computing “capability regions”, as introduced in “Measuring the Capabilities of Quantum Computers” arXiv:2008.11294.

Returns an integer that classifies the input list of success probabilities (SPs) as either

  • “success”: all SPs are above the specified threshold; indicated by the int 2.

  • “indeterminate”: some SPs are above and some are below the threshold; indicated by the int 1.

  • “fail”: all SPs are below the threshold; indicated by the int 0.

This classification is based on a hypothesis test whose null hypothesis is “success” or “fail”. That is, the set of success probabilities is designated “indeterminate” only if there is statistically significant evidence that at least one success probability is above the threshold and at least one is below it. The details of this hypothesis test are given in Section 8.B.5 of the Supplement of arXiv:2008.11294.

Parameters

success_probabilities : list

List of success probabilities, should all be in [0,1].

total_counts : list

The number of samples from which the success probabilities were computed.

threshold : float

The threshold for designating a success probability as “success”.

significance : float, optional

The statistical significance for the hypothesis test.

Returns

int in (2, 1, 0), corresponding to (“success”, “indeterminate”, “fail”) classifications. If the SPs list is length 0 then NaN is returned, and if it contains only NaN elements then 0 is returned. Otherwise, all NaN elements are ignored.
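
The return-value conventions above can be illustrated with a deliberately naive sketch. It is not the pyGSTi implementation: it compares each success probability to the threshold directly and skips the hypothesis test entirely, so it ignores total_counts and significance and will label borderline data “indeterminate” more readily than the real function. The name naive_classify is hypothetical.

```python
import math

def naive_classify(success_probabilities, threshold):
    """Return 2 (success), 1 (indeterminate) or 0 (fail); no hypothesis test is applied."""
    if len(success_probabilities) == 0:
        return float("nan")  # empty input -> NaN, as documented
    # NaN entries are ignored, as documented.
    sps = [p for p in success_probabilities if not math.isnan(p)]
    if not sps:
        return 0  # only NaN entries -> 0, as documented
    if all(p > threshold for p in sps):
        return 2
    if all(p < threshold for p in sps):
        return 0
    # Mixed results (or a value exactly at the threshold, an assumption of this sketch).
    return 1

mixed = naive_classify([0.9, 0.2, float("nan")], threshold=0.5)
```

Here mixed evaluates to 1 because one value lies above the threshold and one below. The documented function would return 1 only if both deviations were statistically significant given the counts.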

class pygsti.protocols.vbdataframe.VBDataFrame(df: pandas.DataFrame, x_axis: str = 'Depth', y_axis: str = 'Width', x_values: str | None = None, y_values: str | None = None, edesign: pygsti.protocols.ExperimentDesign | None = None)

Bases: object

A class for storing a DataFrame that contains volumetric benchmarking data, and that has methods for manipulating that data and creating volumetric-benchmarking-like plots.

Initialize a VBDataFrame object.

Parameters

df : Pandas DataFrame

A DataFrame that contains the volumetric benchmarking data. This sort of DataFrame can be created using ByDepthSummaryStatistics protocols and the to_dataframe() method of the created results object.

x_axis : string, optional

A VBDataFrame is intended to create volumetric-benchmarking-like plots where performance is plotted on an (x, y) grid. This specifies what the x-axis of these plots should be. It should be a column label in the DataFrame.

y_axis : string, optional

A VBDataFrame is intended to create volumetric-benchmarking-like plots where performance is plotted on an (x, y) grid. This specifies what the y-axis of these plots should be. It should be a column label in the DataFrame.

x_values : string or None, optional

y_values : string or None, optional

edesign : ExperimentDesign or None, optional

The ExperimentDesign that corresponds to the data in the dataframe. This is not currently used by any methods in the VBDataFrame.

dataframe
x_axis = "'Depth'"
y_axis = "'Width'"
edesign = 'None'
classmethod from_mirror_experiment(unmirrored_design: pygsti.protocols.FreeformDesign, mirrored_data: pygsti.protocols.ProtocolData, include_dropped_gates: bool = False, bootstrap: bool = True, num_bootstraps: int = 50, rand_state: numpy.random.RandomState | None = None, verbose: bool = False) -> VBDataFrame

Create a dataframe from MCFE data and edesigns.

Parameters
unmirrored_design: pygsti.protocols.protocol.FreeformDesign

Edesign containing the circuits whose process fidelities are to be estimated.

mirrored_data: pygsti.protocols.protocol.ProtocolData

Data object containing the full mirror edesign and the outcome counts for each circuit in the full mirror edesign.

include_dropped_gates: bool

Whether to include the number of gates dropped from each subcircuit during subcircuit creation. This flag should be set to False for noise benchmark and fullstack benchmark analysis, but can be set to True for subcircuit benchmark analysis. Default is False.

bootstrap: bool

Toggle the calculation of error bars from bootstrapped process fidelity calculations. If True, error bars are calculated. If False, error bars are not calculated.

num_bootstraps: int

Number of samples to draw from the bootstrapped process fidelity calculations. This argument is ignored if ‘bootstrap’ is False.

rand_state: np.random.RandomState

Random state used to seed the bootstrapping. This argument is ignored if ‘bootstrap’ is False.

verbose: bool

Toggle print statements with debug information. If True, print statements are turned on. If False, print statements are omitted.

Returns
VBDataFrame

A VBDataFrame whose dataframe contains calculated MCFE values and circuit statistics.

select_column_value(column_label, column_value)

Filters the dataframe by discarding all rows for which the column labelled column_label does not have the value column_value.

Parameters
column_label : string

The label of the column whose value is to be filtered on.

column_value : varied

The value of the column.

Returns
VBDataFrame

A new VBDataFrame that has had the filtering applied to its dataframe.
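
As a rough illustration of this row filter, the sketch below uses plain Python dicts as stand-ins for DataFrame rows. The column names follow those mentioned in this documentation, the values are invented, and select_rows is a hypothetical helper, not part of pyGSTi.

```python
# Plain dicts stand in for DataFrame rows; the values are invented for illustration.
rows = [
    {"Depth": 2, "Width": 1, "Qubits": "Q0", "polarization": 0.75},
    {"Depth": 2, "Width": 1, "Qubits": "Q1", "polarization": 0.5},
    {"Depth": 4, "Width": 1, "Qubits": "Q0", "polarization": 0.25},
]

def select_rows(rows, column_label, column_value):
    """Keep only the rows whose `column_label` entry equals `column_value`."""
    return [row for row in rows if row[column_label] == column_value]

# A new list is returned; the original rows are left untouched.
q0_rows = select_rows(rows, "Qubits", "Q0")
```

With a real pandas DataFrame df, the analogous selection would be the boolean-mask expression df[df[column_label] == column_value].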

filter_data(column_label, metric='polarization', statistic='mean', indep_x=True, threshold=1 / _np.e, verbosity=0)

Filters the dataframe by selecting the “best” value at each (x, y) (typically corresponding to circuit shape) for the column specified by column_label. Returns a VBDataFrame whose data contains only one value for the column labelled by column_label for each (x, y).

Parameters
column_label : string

The label of the column whose “best” value at each circuit shape is to be selected. For example, this could be “Qubits”, to select only the data for the best qubit subset at each circuit shape.

metric : string, optional

The data to be used as the figure-of-merit for performance at each (x, y). Must be a column of the dataframe.

statistic : string, optional

The statistic used to reduce the data specified by metric at each (x, y) to a scalar. Allowed values are ‘max’, ‘min’, and ‘mean’.

indep_x : bool, optional

If True, then an independent value, for the column, is selected at each (x, y) value. If False, then the same value for the column is selected for every x value for a given y.

threshold : float, optional

Does nothing if indep_x is True. If indep_x is False, then ‘metric’ and ‘statistic’ are not enough to uniquely decide which column value is best. In this case, the value is chosen that, for each y in (x,y), maximizes the x value at which the figure-of-merit (as specified by the metric and statistic) drops below the threshold. If there are multiple values that drop below the threshold at the same x (or the figure-of-merit never drops below the threshold for multiple values), then the value with the larger figure-of-merit at that x is chosen.

Returns
VBDataFrame

A new VBDataFrame that has had the filtering applied to its dataframe.
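
The indep_x=True selection can be sketched as follows, assuming that “best” means the column value with the largest mean of the metric at each (x, y). Rows are plain dicts standing in for DataFrame rows, the data is invented, and best_value_per_shape is a hypothetical helper rather than pyGSTi code.

```python
from collections import defaultdict
from statistics import mean

def best_value_per_shape(rows, column_label, metric="polarization",
                         x_axis="Depth", y_axis="Width"):
    """Return {(x, y): column value with the largest mean metric at that (x, y)}."""
    # Group the metric values first by circuit shape, then by column value.
    by_shape = defaultdict(lambda: defaultdict(list))
    for row in rows:
        by_shape[(row[x_axis], row[y_axis])][row[column_label]].append(row[metric])
    return {
        shape: max(by_value, key=lambda value: mean(by_value[value]))
        for shape, by_value in by_shape.items()
    }

rows = [
    {"Depth": 2, "Width": 1, "Qubits": "Q0", "polarization": 0.75},
    {"Depth": 2, "Width": 1, "Qubits": "Q0", "polarization": 0.5},
    {"Depth": 2, "Width": 1, "Qubits": "Q1", "polarization": 0.5},
    {"Depth": 4, "Width": 1, "Qubits": "Q0", "polarization": 0.25},
    {"Depth": 4, "Width": 1, "Qubits": "Q1", "polarization": 0.5},
]
best = best_value_per_shape(rows, "Qubits")
```

In this invented data set the best qubit subset changes with depth, which is the behaviour indep_x=True permits. With indep_x=False a single value would be chosen for every x at a given y.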

vb_data(metric='polarization', statistic='mean', lower_cutoff=0.0, no_data_action='discard')

Converts the data into a dictionary, for plotting in a volumetric benchmarking plot. For each (x, y) value (as specified by the axes of this VBDataFrame, and typically circuit shape), pools all of the data specified by metric with that (x, y) and computes the statistic on that data defined by statistic.

Parameters
metric : string, optional

The type of data. Must be a column of the dataframe.

statistic : string, optional

The statistic on the data to be computed at each value of (x, y). Options are:

  • ‘max’: the maximum

  • ‘min’: the minimum.

  • ‘mean’: the mean.

  • ‘monotonic_max’: the maximum of all the data with (x, y) values that are that large or larger

  • ‘monotonic_min’: the minimum of all the data with (x, y) values that are that small or smaller

All these options ignore nan values.

lower_cutoff : float, optional

The value at which to cut off the statistic: the result is the maximum of the calculated statistic and this value.

no_data_action: string, optional

Sets what to do when there is no data, or only NaN data, at an (x, y) value:

  • If ‘discard’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will not be a key in the returned dictionary

  • If ‘nan’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will be a key in the returned dictionary and its value will be NaN.

  • If ‘min’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will be a key in the returned dictionary and its value will be the minimal value allowed for this statistic, as specified by lower_cutoff.

Returns
dict

A dictionary where the keys are (x,y) tuples (typically circuit shapes) and the values are the VB data at that (x, y).
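
The pooling described above can be sketched in plain Python. This is an illustrative approximation under stated assumptions, not the pyGSTi code: rows are dicts standing in for DataFrame rows, the grid of (x, y) points is passed in explicitly, only the ‘max’, ‘min’ and ‘mean’ statistics are covered, and pooled_vb_data is a hypothetical name.

```python
import math
from statistics import mean

STATISTICS = {"max": max, "min": min, "mean": mean}

def pooled_vb_data(rows, grid, metric="polarization", statistic="mean",
                   lower_cutoff=0.0, no_data_action="discard",
                   x_axis="Depth", y_axis="Width"):
    """Pool the metric at each (x, y) in `grid` and reduce it to one number."""
    vb = {}
    for (x, y) in grid:
        # Pool every non-NaN value recorded at this (x, y).
        values = [row[metric] for row in rows
                  if row[x_axis] == x and row[y_axis] == y
                  and not math.isnan(row[metric])]
        if values:
            vb[(x, y)] = max(STATISTICS[statistic](values), lower_cutoff)
        elif no_data_action == "nan":
            vb[(x, y)] = float("nan")
        elif no_data_action == "min":
            vb[(x, y)] = lower_cutoff
        # 'discard': the key is simply left out.
    return vb

rows = [
    {"Depth": 2, "Width": 1, "polarization": 0.75},
    {"Depth": 2, "Width": 1, "polarization": 0.25},
    {"Depth": 4, "Width": 1, "polarization": float("nan")},
]
grid = [(2, 1), (4, 1)]
discarded = pooled_vb_data(rows, grid)
filled = pooled_vb_data(rows, grid, statistic="max",
                        lower_cutoff=0.125, no_data_action="min")
```

In this example the point (4, 1) has only NaN data, so it is absent from discarded and takes the value of lower_cutoff in filled.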

capability_regions(metric='polarization', threshold=1 / _np.e, significance=0.05, monotonic=True, nan_data_action='discard')

Computes a “capability region” from the data, as introduced in “Measuring the Capabilities of Quantum Computers” arXiv:2008.11294. Classifies each (x,y) value (as specified by the x and y axes of the VBDataFrame, which are typically depth and width) as either “success” (the int 2), “indeterminate” (the int 1), “fail” (the int 0), or “no data” (NaN).

Parameters
metric : string, optional

The type of data. Must be ‘polarization’ or ‘success_probability’, and this must be a column in the dataframe.

threshold : float, optional

The threshold for “success”.

significance : float, optional

The statistical significance for the hypothesis tests that are used to classify each circuit shape.

monotonic : bool, optional

If True, makes the region monotonic, i.e., if (x’,y’) > (x,y) then the classification for (x’,y’) is less/worse than for (x,y).

nan_data_action : string, optional

If ‘discard’, then an (x,y) value with no data will not be a key in the returned dictionary. Otherwise, its value will be NaN.

Returns
dict

A dictionary where the keys are (x,y) tuples (typically circuit shapes) and the values are in (2, 1, 0, NaN).
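
One way to enforce the monotonicity described for the monotonic option is sketched below. It assumes that “(x’,y’) > (x,y)” is read componentwise, and that a point may never be classified better than any point at a smaller-or-equal x and y. NaN (“no data”) entries are left out for simplicity. The helper make_monotonic is hypothetical and may differ from what pyGSTi does internally.

```python
def make_monotonic(region):
    """Cap each classification at the worst one found at any smaller-or-equal (x, y)."""
    return {
        (x, y): min(c for (xp, yp), c in region.items() if xp <= x and yp <= y)
        for (x, y) in region
    }

# Invented raw classifications: 2 = success, 1 = indeterminate, 0 = fail.
raw = {(1, 1): 2, (2, 1): 1, (4, 1): 2, (1, 2): 2, (2, 2): 2}
monotone = make_monotonic(raw)
```

In this invented example the “indeterminate” result at (2, 1) caps the classifications at (4, 1) and (2, 2), which were “success” in the raw data.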

create_vb_plot(title: str, accumulator: Callable = _np.mean, cp_or_rc: Literal['cp', 'rc'] = 'rc', show_dropped_gates: bool = False, dg_accumulator: Callable = _np.mean, cmap: matplotlib.colors.Colormap = None, margin: float = 0.15, save_fig: bool = False, fig_path: str = None, fig_format: str = None)

Generate a process fidelity volumetric benchmarking (VB) plot from the dataframe. This function is designed with subcircuit volumetric benchmarking in mind, where the x-axis is the subcircuit depth and the y-axis is the subcircuit width.

Parameters
title : str

The title of the plot.

accumulator : callable, optional

Function used to accumulate process fidelities for a (width, depth) pair on the VB plot. Default is np.mean.

cp_or_rc : str, optional

Whether the process fidelities were computed via randomly compiled circuits (‘rc’) or central Pauli mirroring (‘cp’). Default is ‘rc’.

show_dropped_gates : bool, optional

Whether the plot should visualize the average (see dg_accumulator) number of dropped gates for each subcircuit width-depth pair. Subcircuit sampling can drop gates when a gate has only partial support on the qubits in the selected width subset.

dg_accumulator : callable, optional

Function used to accumulate the dropped gate counts for a (width, depth) pair on the VB plot. Default is np.mean.

cmap : matplotlib.colors.Colormap, optional

Colormap to use for plotting process fidelities. Default is spectral.

margin : float, optional

Margin between adjacent width-depth pairs in the VB plot. Default is 0.15.

save_fig : bool, optional

Whether to write the generated VB plot to file. Default is False.

fig_path : str, optional

If save_fig is set to True, this argument is used as the path the figure is saved to. If save_fig is False, this argument is ignored. Default is None.

fig_format : str, optional

If save_fig is set to True, this argument is the file format for the generated VB plot. If save_fig is False, this argument is ignored. Acceptable values are any file format recognized by matplotlib.