pygsti.protocols.vbdataframe
Techniques for manipulating benchmarking data stored in a Pandas DataFrame.
Module Contents
Classes
VBDataFrame: A class for storing a DataFrame that contains volumetric benchmarking data, and that has methods for manipulating that data and creating volumetric-benchmarking-like plots.
Functions
classify_circuit_shape: Utility function for computing “capability regions”, as introduced in “Measuring the Capabilities of Quantum Computers” arXiv:2008.11294.
- pygsti.protocols.vbdataframe.classify_circuit_shape(success_probabilities, total_counts, threshold, significance=0.05)
Utility function for computing “capability regions”, as introduced in “Measuring the Capabilities of Quantum Computers” arXiv:2008.11294.
Returns an integer that classifies the input list of success probabilities (SPs) as one of:
“success”: all SPs are above the specified threshold, indicated by the int 2.
“indeterminate”: some SPs are above and some are below the threshold, indicated by the int 1.
“fail”: all SPs are below the threshold, indicated by the int 0.
This classification is based on a hypothesis test whereby the null hypothesis is “success” or “fail”. That is, the set of success probabilities is designated “indeterminate” only if there is statistically significant evidence that at least one success probability is above the threshold and at least one is below. The details of this hypothesis testing are given in Section 8.B.5 of the Supplement of arXiv:2008.11294.
Parameters
- success_probabilities : list
List of success probabilities, each of which should be in [0, 1].
- total_counts : list
The number of samples from which the success probabilities were computed.
- threshold : float
The threshold for designating a success probability as “success”.
- significance : float, optional
The statistical significance for the hypothesis test.
Returns
int in (2, 1, 0), corresponding to (“success”, “indeterminate”, “fail”) classifications. If the SPs list is length 0 then NaN is returned, and if it contains only NaN elements then 0 is returned. Otherwise, all NaN elements are ignored.
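The return codes and NaN handling described above can be illustrated with a minimal, standard-library sketch. It is not the pyGSTi implementation: the exact one-sided binomial tests, the Bonferroni correction, and the mean-based fallback used when the data are not “indeterminate” are all assumptions made for illustration, and the procedure in Section 8.B.5 of the Supplement may differ. The names classify_sketch, binom_tail_ge and binom_tail_le are hypothetical.

```python
import math


def binom_tail_ge(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binom_tail_le(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))


def classify_sketch(success_probabilities, total_counts, threshold, significance=0.05):
    """Return 2 (success), 1 (indeterminate) or 0 (fail); NaN for an empty list."""
    if len(success_probabilities) == 0:
        return math.nan
    # NaN success probabilities are ignored.
    pairs = [(sp, n) for sp, n in zip(success_probabilities, total_counts)
             if not math.isnan(sp)]
    if not pairs:
        return 0  # only NaN elements
    # ASSUMPTION: Bonferroni correction over the circuits being tested.
    alpha = significance / len(pairs)
    evidence_above = any(
        binom_tail_ge(round(sp * n), n, threshold) < alpha for sp, n in pairs)
    evidence_below = any(
        binom_tail_le(round(sp * n), n, threshold) < alpha for sp, n in pairs)
    if evidence_above and evidence_below:
        return 1
    # ASSUMPTION: without significant evidence on both sides, decide by the mean SP.
    mean_sp = sum(sp for sp, _ in pairs) / len(pairs)
    return 2 if mean_sp >= threshold else 0


print(classify_sketch([0.9, 0.95], [100, 100], 0.5))  # all clearly above
print(classify_sketch([0.9, 0.1], [100, 100], 0.5))   # significant on both sides
```

Under these assumptions, a pair such as [0.52, 0.49] with 100 counts each is not classified as “indeterminate”, because neither deviation from a threshold of 0.5 is statistically significant.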
- class pygsti.protocols.vbdataframe.VBDataFrame(df: pandas.DataFrame, x_axis: str = 'Depth', y_axis: str = 'Width', x_values: str | None = None, y_values: str | None = None, edesign: pygsti.protocols.ExperimentDesign | None = None)
Bases: object
A class for storing a DataFrame that contains volumetric benchmarking data, and that has methods for manipulating that data and creating volumetric-benchmarking-like plots.
Initialize a VBDataFrame object.
Parameters
- df : pandas DataFrame
A DataFrame that contains the volumetric benchmarking data. This sort of DataFrame can be created using the ByDepthSummaryStatistics protocol and the to_dataframe() method of the resulting results object.
- x_axis : string, optional
A VBDataFrame is intended to create volumetric-benchmarking-like plots where performance is plotted on an (x, y) grid. This specifies what the x-axis of these plots should be. It should be a column label in the DataFrame.
- y_axis : string, optional
A VBDataFrame is intended to create volumetric-benchmarking-like plots where performance is plotted on an (x, y) grid. This specifies what the y-axis of these plots should be. It should be a column label in the DataFrame.
- x_values : string or None, optional
- y_values : string or None, optional
- edesign : ExperimentDesign or None, optional
The ExperimentDesign that corresponds to the data in the dataframe. This is not currently used by any methods in the VBDataFrame.
- dataframe
- x_axis = 'Depth'
- y_axis = 'Width'
- edesign = None
- classmethod from_mirror_experiment(unmirrored_design: pygsti.protocols.FreeformDesign, mirrored_data: pygsti.protocols.ProtocolData, include_dropped_gates: bool = False, bootstrap: bool = True, num_bootstraps: int = 50, rand_state: numpy.random.RandomState | None = None, verbose: bool = False) -> VBDataFrame
Create a VBDataFrame from mirror circuit fidelity estimation (MCFE) data and experiment designs (edesigns).
Parameters
- unmirrored_design: pygsti.protocols.protocol.FreeformDesign
Edesign containing the circuits whose process fidelities are to be estimated.
- mirrored_data: pygsti.protocols.protocol.ProtocolData
Data object containing the full mirror edesign and the outcome counts for each circuit in the full mirror edesign.
- include_dropped_gates: bool
Whether to include the number of gates dropped from each subcircuit during subcircuit creation. This flag should be set to False for noise benchmark and fullstack benchmark analysis, but can be set to True for subcircuit benchmark analysis. Default is False.
- bootstrap: bool
Toggle the calculation of error bars from bootstrapped process fidelity calculations. If True, error bars are calculated. If False, error bars are not calculated.
- num_bootstraps: int
Number of samples to draw from the bootstrapped process fidelity calculations. This argument is ignored if ‘bootstrap’ is False.
- rand_state: np.random.RandomState
Random state used to seed bootstrapping. If ‘bootstrap’ is set to False, this argument is ignored.
- verbose: bool
Toggle print statements with debug information. If True, print statements are turned on. If False, print statements are omitted.
Returns
- VBDataFrame
A VBDataFrame whose dataframe contains calculated MCFE values and circuit statistics.
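To make the roles of bootstrap, num_bootstraps and rand_state more concrete, here is a minimal sketch of a parametric bootstrap error bar. It is an illustration only and does not reproduce the MCFE process fidelity calculation: it bootstraps a single success probability, and it uses the standard-library random.Random as a stand-in for numpy.random.RandomState. The function name bootstrap_std is hypothetical.

```python
import random
from statistics import stdev


def bootstrap_std(successes, shots, num_bootstraps=50, rand_state=None):
    """Estimate the standard error of successes/shots by parametric bootstrap."""
    rng = rand_state if rand_state is not None else random.Random()
    p_hat = successes / shots
    resampled = []
    for _ in range(num_bootstraps):
        # Draw a new count from Binomial(shots, p_hat), one shot at a time.
        k = sum(1 for _ in range(shots) if rng.random() < p_hat)
        resampled.append(k / shots)
    # The spread of the resampled estimates serves as the error bar.
    return stdev(resampled)


# Passing a seeded generator makes the error bar reproducible.
error_bar = bootstrap_std(80, 100, num_bootstraps=50, rand_state=random.Random(0))
print(error_bar)
```

Seeding the generator plays the same role here as passing rand_state to from_mirror_experiment: two calls with identically seeded generators return the same error bar.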
- select_column_value(column_label, column_value)
Filters the dataframe by discarding all rows for which the column labelled column_label does not have the value column_value.
Parameters
- column_label : string
The label of the column whose value is to be filtered on.
- column_value : varied
The value of the column.
Returns
- VBDataFrame
A new VBDataFrame that has had the filtering applied to its dataframe.
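The filtering semantics can be sketched without pandas by treating each dataframe row as a dictionary. This is an illustrative stand-in, not the library code: select_rows and the example column values are hypothetical. On a real pandas DataFrame the equivalent operation is boolean-mask indexing on the chosen column.

```python
def select_rows(rows, column_label, column_value):
    """Keep only the rows whose `column_label` entry equals `column_value`."""
    return [row for row in rows if row.get(column_label) == column_value]


# Hypothetical benchmarking rows; the column names mirror the default axes.
rows = [
    {"Depth": 4, "Width": 2, "Qubits": "Q0-Q1", "polarization": 0.75},
    {"Depth": 4, "Width": 2, "Qubits": "Q2-Q3", "polarization": 0.5},
    {"Depth": 8, "Width": 2, "Qubits": "Q0-Q1", "polarization": 0.25},
]

kept = select_rows(rows, "Qubits", "Q0-Q1")
print(len(kept))  # number of rows that survive the filter
```

As with select_column_value, the sketch returns a new collection and leaves the input untouched.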
- filter_data(column_label, metric='polarization', statistic='mean', indep_x=True, threshold=1 / _np.e, verbosity=0)
Filters the dataframe by selecting the “best” value at each (x, y) (typically corresponding to circuit shape) for the column specified by column_label. Returns a VBDataFrame whose data contains only one value for the column labelled by column_label for each (x, y).
Parameters
- column_label : string
The label of the column whose “best” value at each circuit shape is to be selected. For example, this could be “Qubits”, to select only the data for the best qubit subset at each circuit shape.
- metric : string, optional
The data to be used as the figure-of-merit for performance at each (x, y). Must be a column of the dataframe.
- statistic : string, optional
The statistic used to reduce the data specified by metric at each (x, y) to a scalar. Allowed values are ‘max’, ‘min’, and ‘mean’.
- indep_x : bool, optional
If True, then an independent value for the column is selected at each (x, y) value. If False, then the same value for the column is selected for every x value for a given y.
- threshold : float, optional
Does nothing if indep_x is True. If indep_x is False, then ‘metric’ and ‘statistic’ are not enough to uniquely decide which column value is best. In this case, the value is chosen that, for each y in (x,y), maximizes the x value at which the figure-of-merit (as specified by the metric and statistic) drops below the threshold. If there are multiple values that drop below the threshold at the same x (or the figure-of-merit never drops below the threshold for multiple values), then the value with the larger figure-of-merit at that x is chosen.
Returns
- VBDataFrame
A new VBDataFrame that has had the filtering applied to its dataframe.
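The indep_x=True case can be sketched as follows, again using a list of dictionaries in place of the DataFrame. This is a simplified illustration under stated assumptions: “best” is taken to mean the largest figure-of-merit, ties keep the first value seen, and the indep_x=False threshold rule is not implemented. The function best_value_per_shape is hypothetical.

```python
from collections import defaultdict
from statistics import mean

STATISTICS = {"max": max, "min": min, "mean": mean}


def best_value_per_shape(rows, column_label, metric="polarization",
                         statistic="mean", x_axis="Depth", y_axis="Width"):
    """Keep, at each (x, y), only the rows of the best-scoring column value."""
    # Pool the metric for every (x, y, column value) combination.
    pooled = defaultdict(list)
    for row in rows:
        pooled[(row[x_axis], row[y_axis], row[column_label])].append(row[metric])
    # ASSUMPTION: "best" means the largest value of the chosen statistic.
    best = {}
    for (x, y, value), data in pooled.items():
        score = STATISTICS[statistic](data)
        if (x, y) not in best or score > best[(x, y)][1]:
            best[(x, y)] = (value, score)
    return [row for row in rows
            if best[(row[x_axis], row[y_axis])][0] == row[column_label]]


rows = [
    {"Depth": 4, "Width": 2, "Qubits": "A", "polarization": 0.75},
    {"Depth": 4, "Width": 2, "Qubits": "A", "polarization": 0.625},
    {"Depth": 4, "Width": 2, "Qubits": "B", "polarization": 0.875},
    {"Depth": 4, "Width": 2, "Qubits": "B", "polarization": 0.25},
    {"Depth": 8, "Width": 2, "Qubits": "B", "polarization": 0.5},
]

by_mean = best_value_per_shape(rows, "Qubits", statistic="mean")
by_max = best_value_per_shape(rows, "Qubits", statistic="max")
```

In this example the choice of statistic changes the winner at depth 4: qubit set A has the higher mean, while B has the higher maximum.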
- vb_data(metric='polarization', statistic='mean', lower_cutoff=0.0, no_data_action='discard')
Converts the data into a dictionary, for plotting in a volumetric benchmarking plot. For each (x, y) value (as specified by the axes of this VBDataFrame, and typically circuit shape), pools all of the data specified by metric with that (x, y) and computes the statistic on that data defined by statistic.
Parameters
- metric : string, optional
The type of data. Must be a column of the dataframe.
- statistic : string, optional
The statistic on the data to be computed at each value of (x, y). Options are:
‘max’: the maximum.
‘min’: the minimum.
‘mean’: the mean.
‘monotonic_max’: the maximum of all the data with (x, y) values that are that large or larger.
‘monotonic_min’: the minimum of all the data with (x, y) values that are that small or smaller.
All these options ignore NaN values.
- lower_cutoff : float, optional
The value at which to cut off the statistic: the returned value is the maximum of the calculated statistic and this value.
- no_data_action: string, optional
Sets what to do when there is no data, or only NaN data, at an (x, y) value:
If ‘discard’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will not be a key in the returned dictionary
If ‘nan’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will be a key in the returned dictionary and its value will be NaN.
If ‘min’ then when there is no data, or only NaN data, for an (x,y) value then this (x,y) value will be a key in the returned dictionary and its value will be the minimal value allowed for this statistic, as specified by lower_cutoff.
Returns
- dict
A dictionary where the keys are (x,y) tuples (typically circuit shapes) and the values are the VB data at that (x, y).
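The pooling, lower_cutoff and no_data_action behaviour described above can be sketched in plain Python. This is an illustration rather than the pyGSTi code: it covers only the ‘max’, ‘min’ and ‘mean’ statistics, and it only knows about (x, y) values that appear in the input rows, whereas the real method may also consider grid points with no rows at all. The name vb_data_sketch is hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def vb_data_sketch(rows, metric="polarization", statistic="mean",
                   lower_cutoff=0.0, no_data_action="discard",
                   x_axis="Depth", y_axis="Width"):
    """Return {(x, y): statistic of the pooled metric}, ignoring NaN entries."""
    reducers = {"max": max, "min": min, "mean": mean}
    pooled = defaultdict(list)
    for row in rows:
        values = pooled[(row[x_axis], row[y_axis])]  # registers the shape
        if not math.isnan(row[metric]):
            values.append(row[metric])
    result = {}
    for shape, values in pooled.items():
        if values:
            # The reported value never drops below lower_cutoff.
            result[shape] = max(reducers[statistic](values), lower_cutoff)
        elif no_data_action == "nan":
            result[shape] = math.nan
        elif no_data_action == "min":
            result[shape] = lower_cutoff
        # 'discard': the shape is simply left out of the dictionary.
    return result


rows = [
    {"Depth": 4, "Width": 2, "polarization": 0.75},
    {"Depth": 4, "Width": 2, "polarization": 0.5},
    {"Depth": 8, "Width": 2, "polarization": math.nan},
    {"Depth": 8, "Width": 4, "polarization": 0.125},
]

vb = vb_data_sketch(rows, statistic="mean", lower_cutoff=0.25)
```

Here the shape (8, 2) has only NaN data, so it is dropped under the default ‘discard’ action, and the value at (8, 4) is raised to the cutoff.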
- capability_regions(metric='polarization', threshold=1 / _np.e, significance=0.05, monotonic=True, nan_data_action='discard')
Computes a “capability region” from the data, as introduced in “Measuring the Capabilities of Quantum Computers” arXiv:2008.11294. Classifies each (x,y) value (as specified by the x and y axes of the VBDataFrame, which are typically depth and width) as either “success” (the int 2), “indeterminate” (the int 1), “fail” (the int 0), or “no data” (NaN).
Parameters
- metric : string, optional
The type of data. Must be ‘polarization’ or ‘success_probability’, and this must be a column in the dataframe.
- threshold : float, optional
The threshold for “success”.
- significance : float, optional
The statistical significance for the hypothesis tests that are used to classify each circuit shape.
- monotonic : bool, optional
If True, makes the region monotonic, i.e., if (x’,y’) > (x,y) then the classification for (x’,y’) is no better than that for (x,y).
- nan_data_action : string, optional
If ‘discard’, then when there is no data for an (x,y) value, this (x,y) value will not be a key in the returned dictionary. Otherwise the value will be NaN.
Returns
- dict
A dictionary where the keys are (x,y) tuples (typically circuit shapes) and the values are in (2, 1, 0, NaN).
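One way to picture the monotonic option is as a post-processing pass over the classification dictionary: each shape receives the worst classification found among all shapes that are no larger in both x and y. The sketch below assumes this reading; leaving NaN entries untouched and excluding them from the comparison is also an assumption, and the actual implementation may handle these cases differently. The name make_monotonic is hypothetical.

```python
import math


def _is_nan(value):
    return isinstance(value, float) and math.isnan(value)


def make_monotonic(region):
    """Make each shape's class no better than that of any smaller-or-equal shape."""
    monotonic = {}
    for (x, y), cls in region.items():
        if _is_nan(cls):
            monotonic[(x, y)] = cls  # ASSUMPTION: "no data" entries are kept as-is
            continue
        # Classes of every shape that is no larger in both x and y (including itself).
        dominated = [c for (xs, ys), c in region.items()
                     if xs <= x and ys <= y and not _is_nan(c)]
        monotonic[(x, y)] = min(dominated)
    return monotonic


region = {
    (1, 1): 2, (2, 1): 1, (4, 1): 2,
    (1, 2): 2, (2, 2): 2, (4, 2): math.nan,
}
fixed = make_monotonic(region)
```

In this example the “success” at (4, 1) is downgraded to “indeterminate” because the smaller shape (2, 1) is already indeterminate, and the same downgrade applies to (2, 2).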
- create_vb_plot(title: str, accumulator: Callable = _np.mean, cp_or_rc: Literal['cp', 'rc'] = 'rc', show_dropped_gates: bool = False, dg_accumulator: Callable = _np.mean, cmap: matplotlib.colors.Colormap = None, margin: float = 0.15, save_fig: bool = False, fig_path: str = None, fig_format: str = None)
Generate a process fidelity volumetric benchmarking (VB) plot from the dataframe. This function is designed with subcircuit volumetric benchmarking in mind, where the x-axis is the subcircuit depth and the y-axis is the subcircuit width.
Parameters
- title : str
The title of the plot.
- accumulator : callable, optional
Function used to accumulate process fidelities for a (width, depth) pair on the VB plot. Default is np.mean.
- cp_or_rc : str, optional
Whether the process fidelities were computed via randomly compiled circuits (‘rc’) or central Pauli mirroring (‘cp’). Default is ‘rc’.
- show_dropped_gates : bool, optional
Whether the plot should visualize the average (see dg_accumulator) number of dropped gates for each subcircuit width-depth pair. Subcircuit sampling can drop gates when a gate has only partial support on the qubits in the selected width subset.
- dg_accumulator : callable, optional
Function used to accumulate the dropped gate counts for a (width, depth) pair on the VB plot. Default is np.mean.
- cmap : matplotlib.colors.Colormap, optional
Colormap to use for plotting process fidelities. Default is spectral.
- margin : float, optional
Margin between adjacent width-depth pairs in the VB plot. Default is 0.15.
- save_fig : bool, optional
Whether to write the generated VB plot to file. Default is False.
- fig_path : str, optional
If save_fig is set to True, this argument is used as the path the figure is saved to. If save_fig is False, this argument is ignored. Default is None.
- fig_format : str, optional
If save_fig is set to True, this argument is the file format for the generated VB plot. If save_fig is False, this argument is ignored. Acceptable values are any file format recognized by matplotlib.