pygsti.data.dataset¶
Defines the DataSet class and supporting classes and functions
Module Contents¶
Classes¶
_DataSetKVIterator – Iterator class for op_string,_DataSetRow pairs of a DataSet

_DataSetValueIterator – Iterator class for _DataSetRow values of a DataSet

_DataSetRow – Encapsulates DataSet time series data for a single circuit.

DataSet – An association between Circuits and outcome counts, serving as the input data for many QCVV protocols.
Functions¶

_round_int_repcnt – Helper function to localize warning message
Attributes¶
 pygsti.data.dataset.Oindex_type¶
 pygsti.data.dataset.Time_type¶
 pygsti.data.dataset.Repcount_type¶
 pygsti.data.dataset._DATAROW_AUTOCACHECOUNT_THRESHOLD = 256¶
 class pygsti.data.dataset._DataSetKVIterator(dataset)¶
Bases:
object
Iterator class for op_string,_DataSetRow pairs of a DataSet
 Parameters
dataset (DataSet) – The parent data set.
 next¶
 __iter__(self)¶
 __next__(self)¶
 class pygsti.data.dataset._DataSetValueIterator(dataset)¶
Bases:
object
Iterator class for _DataSetRow values of a DataSet
 Parameters
dataset (DataSet) – The parent data set.
 next¶
 __iter__(self)¶
 __next__(self)¶
 class pygsti.data.dataset._DataSetRow(dataset, row_oli_data, row_time_data, row_rep_data, cached_cnts, aux)¶
Bases:
object
Encapsulates DataSet time series data for a single circuit.
Outwardly, it looks similar to a list with (outcome_label, time_index, repetition_count) tuples as the values.
 Parameters
dataset (DataSet) – The parent data set.
row_oli_data (numpy.ndarray) – The outcome label indices for each bin of this row.
row_time_data (numpy.ndarray) – The timestamps for each bin of this row.
row_rep_data (numpy.ndarray) – The repetition counts for each bin of this row (if None, assume 1 per bin).
cached_cnts (dict) – A cached precomputed count dictionary (for speed).
aux (dict) – Dictionary of auxiliary information.
 outcomes¶
Returns this row’s sequence of outcome labels, one per “bin” of repetition counts (returned by get_counts).
 Type
list
 counts¶
A dictionary of per-outcome counts.
 Type
dict
 allcounts¶
A dictionary of per-outcome counts with all possible outcomes as keys and zero values when an outcome didn’t occur. Note this can be expensive to compute for many-qubit data.
 Type
dict
 fractions¶
A dictionary of per-outcome fractions.
 Type
dict
 total¶
Returns the total number of counts contained in this row.
 Type
int
 property outcomes(self)¶
This row’s sequence of outcome labels, one per “bin” of repetition counts.
 property unique_outcomes(self)¶
This row’s unique set of outcome labels, as a list
 property expanded_ol(self)¶
This row’s sequence of outcome labels, with repetition counts expanded.
Thus, there’s one element in the returned list for each count.
 Returns
list
 property expanded_oli(self)¶
This row’s sequence of outcome label indices, with repetition counts expanded.
Thus, there’s one element in the returned list for each count.
 Returns
numpy.ndarray
 property expanded_times(self)¶
This row’s sequence of time stamps, with repetition counts expanded.
Thus, there’s one element in the returned list for each count.
 Returns
numpy.ndarray
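The three expanded_* properties above share one idea: each bin is repeated once per count, so the result has one element per count. A minimal plain-Python sketch of that expansion follows. The helper name expand_by_reps and the sample row are hypothetical, and this is not pyGSTi's implementation.

```python
def expand_by_reps(values, reps):
    """Repeat values[i] reps[i] times, giving one element per count."""
    expanded = []
    for value, rep in zip(values, reps):
        expanded.extend([value] * rep)
    return expanded

# Hypothetical row with three bins: outcome labels, timestamps, repetition counts.
outcomes = ["0", "1", "0"]
times = [0.0, 0.0, 1.0]
reps = [2, 1, 3]

expanded_outcomes = expand_by_reps(outcomes, reps)
expanded_times = expand_by_reps(times, reps)
```

Both expanded lists have length sum(reps), which is the row's total number of counts.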
 property times(self)¶
A list containing the unique data collection times at which there is at least one measurement result.
 Returns
list
 property timeseries_for_outcomes(self)¶
Row data in a timeseries format.
This can be a much less succinct format than that returned by counts_as_timeseries; for example, it is highly inefficient for many-qubit data.
 Returns
times (list) – The time steps, containing the unique data collection times.
reps (dict) – A dictionary of lists containing the number of times each measurement outcome was observed at the unique data collection times in times.
 counts_as_timeseries(self)¶
Returns data in a timeseries format.
 Returns
times (list) – The time steps, containing the unique data collection times.
reps (list) – A list of dictionaries containing the counts dict corresponding to the list of unique data collection times in times.
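The two time-series layouts described above differ only in how the same bins are grouped. The sketch below builds both from a hypothetical list of (outcome_label, timestamp, repetition_count) bins. The function names are illustrative assumptions and do not call pyGSTi.

```python
from collections import defaultdict

# Hypothetical row data: (outcome_label, timestamp, repetition_count) bins.
bins = [("0", 0.0, 45), ("1", 0.0, 55), ("0", 1.0, 42), ("1", 1.0, 58)]

def as_list_of_count_dicts(bins):
    """One counts dict per unique time (the counts_as_timeseries layout)."""
    per_time = defaultdict(dict)
    for outcome, time, rep in bins:
        per_time[time][outcome] = per_time[time].get(outcome, 0) + rep
    times = sorted(per_time)
    return times, [per_time[t] for t in times]

def as_dict_of_series(bins, all_outcomes):
    """One list of counts per outcome (the timeseries_for_outcomes layout)."""
    times, count_dicts = as_list_of_count_dicts(bins)
    series = {ol: [d.get(ol, 0) for d in count_dicts] for ol in all_outcomes}
    return times, series

times, count_dicts = as_list_of_count_dicts(bins)
_, series = as_dict_of_series(bins, all_outcomes=["0", "1"])
```

The dict-of-series layout stores an entry for every outcome at every time, including zeros, which is why it scales poorly when the number of possible outcomes is large.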
 property reps_timeseries(self)¶
The number of measurement results at each data collection time.
 Returns
times (list) – The time steps.
reps (list) – The total number of counts at each time step.
 property number_of_times(self)¶
Returns the number of data collection times.
 Returns
int
 property has_constant_totalcounts(self)¶
True if the number of counts is the same at all data collection times. Otherwise False.
 Returns
bool
 property totalcounts_per_timestep(self)¶
The number of total counts per timestep, when this is constant.
If the total counts vary over the times that there is at least one measurement result, then this function will raise an error.
 Returns
int
 property meantimestep(self)¶
The mean timestep.
Will raise an error for data that is a trivial timeseries (i.e., data all at one time).
 Returns
float
 __iter__(self)¶
 __contains__(self, outcome_label)¶
Checks whether data counts for outcome_label are available.
 __getitem__(self, index_or_outcome_label)¶
 __setitem__(self, index_or_outcome_label, val)¶
 get(self, index_or_outcome_label, default_value)¶
The number of counts for an index or outcome label.
If the index or outcome is not present, default_value is returned.
 Parameters
index_or_outcome_label (int or str or tuple) – The index or outcome label to lookup.
default_value (object) – The value to return if this data row doesn’t contain data at the given index.
 Returns
int or float
 _get_single_count(self, outcome_label, timestamp=None)¶
 _get_counts(self, timestamp=None, all_outcomes=False)¶
Returns this row’s sequence of “repetition counts”, that is, the number of repetitions of each outcome label in the outcomes list, or equivalently, each outcome label index in this row’s .oli member.
 property counts(self)¶
Dictionary of per-outcome counts.
 property allcounts(self)¶
Dictionary of per-outcome counts with all possible outcomes as keys.
This means that zero values are included when an outcome didn’t occur. Note this can be expensive to assemble for many-qubit data.
 property fractions(self, all_outcomes=False)¶
Dictionary of per-outcome fractions.
 property total(self)¶
The total number of counts contained in this row.
 fraction(self, outcomelabel)¶
The fraction of total counts for outcomelabel.
 Parameters
outcomelabel (str or tuple) – The outcome label, e.g. ‘010’ or (‘0’,’11’).
 Returns
float
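The relationship between counts, allcounts, fractions and total described above can be illustrated with plain dictionaries. The labels and numbers below are made up for illustration, and the real properties operate on a _DataSetRow rather than on a dict.

```python
# Hypothetical per-outcome counts for one circuit; "01" and "10" never occurred.
counts = {"00": 60, "11": 40}
all_outcome_labels = ["00", "01", "10", "11"]

# Total number of counts in the row.
total = sum(counts.values())

# Every possible outcome as a key, with zeros filled in for absent outcomes.
allcounts = {ol: counts.get(ol, 0) for ol in all_outcome_labels}

# Each observed outcome's share of the total.
fractions = {ol: n / total for ol, n in counts.items()}
```

The size of allcounts grows with the number of possible outcomes rather than the number observed, which is the cost the note on many-qubit data refers to.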
 counts_at_time(self, timestamp)¶
Returns a dictionary of counts at a particular time.
 Parameters
timestamp (float) – The time to get counts at.
 Returns
dict
 timeseries(self, outcomelabel, timestamps=None)¶
Retrieve timestamps and counts for a single outcome label or for aggregated counts if outcomelabel == “all”.
 Parameters
outcomelabel (str or tuple) – The outcome label to extract a series for. If the special value “all” is used, total (aggregated over all outcomes) counts are returned.
timestamps (list or array, optional) – If not None, an array of time stamps to extract counts for, which will also be returned as times. Times at which there is no data will be returned as zero-counts.
 Returns
times, counts (numpy.ndarray)
 scale_inplace(self, factor)¶
Scales all the counts of this row by the given factor
 Parameters
factor (float) – scaling factor.
 Returns
None
 to_dict(self)¶
Returns the (outcomeLabel,count) pairs as a dictionary.
 Returns
dict
 to_str(self, mode='auto')¶
Render this _DataSetRow as a string.
 Parameters
mode ({"auto","time-dependent","time-independent"}) – Whether to display the data as a time series of outcome counts (“time-dependent”) or to report per-outcome counts aggregated over time (“time-independent”). If “auto” is specified, then the time-independent mode is used only if all time stamps in the _DataSetRow are equal (trivial time dependence).
 Returns
str
 __str__(self)¶
Return str(self).
 __len__(self)¶
 pygsti.data.dataset._round_int_repcnt(nreps)¶
Helper function to localize warning message
 class pygsti.data.dataset.DataSet(oli_data=None, time_data=None, rep_data=None, circuits=None, circuit_indices=None, outcome_labels=None, outcome_label_indices=None, static=False, file_to_load_from=None, collision_action='aggregate', comment=None, aux_info=None)¶
Bases:
object
An association between Circuits and outcome counts, serving as the input data for many QCVV protocols.
The DataSet class associates circuits with counts or time series of counts for each outcome label, and can be thought of as a table with gate strings labeling the rows and outcome labels and/or time labeling the columns. It is designed to behave similarly to a dictionary of dictionaries, so that counts are accessed by:
count = dataset[circuit][outcomeLabel]
in the time-independent case, and in the time-dependent case, for integer time index i >= 0,
outcomeLabel = dataset[circuit][i].outcome
count = dataset[circuit][i].count
time = dataset[circuit][i].time
 Parameters
oli_data (list or numpy.ndarray) – When static == True, a 1D numpy array containing outcome label indices (integers), concatenated for all sequences. Otherwise, a list of 1D numpy arrays, one array per gate sequence. In either case, this quantity is indexed by the values of circuit_indices or the index of circuits.
time_data (list or numpy.ndarray) – Same format as oli_data, except it stores floating-point timestamp values.
rep_data (list or numpy.ndarray) – Same format as oli_data, except it stores integer repetition counts for each “data bin” (i.e. (outcome,time) pair). If all repetitions equal 1 (“single-shot” timestamped data), then rep_data can be None (no repetitions).
circuits (list of (tuples or Circuits)) – Each element is a tuple of operation labels or a Circuit object. Indices for these circuits are assumed to ascend from 0. These indices must correspond to the time series of spam-label indices (above). Only specify this argument OR circuit_indices, not both.
circuit_indices (ordered dictionary) – An OrderedDict with keys equal to circuits (tuples of operation labels) and values equal to integer indices associating a row/element of counts with the circuit. Only specify this argument OR circuits, not both.
outcome_labels (list of strings or int) – Specifies the set of spam labels for the DataSet. Indices for the spam labels are assumed to ascend from 0, starting with the first element of this list. These indices will associate each element of the time series with a spam label. Only specify this argument OR outcome_label_indices, not both. If an int, specifies that the outcome labels should be those for a standard set of this many qubits.
outcome_label_indices (ordered dictionary) – An OrderedDict with keys equal to spam labels (strings) and values equal to integer indices associating a spam label with a given index. Only specify this argument OR outcome_labels, not both.
static (bool) – When True, create a read-only, i.e. “static”, DataSet which cannot be modified. In this case you must specify the time-series data, circuits, and spam labels. When False, create a DataSet that can have time series data added to it. In this case, you only need to specify the spam labels.
file_to_load_from (string or file object) – Specify this argument and no others to create a static DataSet by loading from a file (just like using the load(…) function).
collision_action ({"aggregate","overwrite","keepseparate"}) – Specifies how duplicate circuits should be handled. “aggregate” adds duplicate-circuit counts to the same circuit’s data at the next integer timestamp. “overwrite” only keeps the latest given data for a circuit. “keepseparate” tags duplicate circuits by setting the .occurrence ID of added circuits that are already contained in this data set to the next available positive integer.
comment (string, optional) – A user-specified comment string that gets carried around with the data. A common use for this field is to attach to the data details regarding its collection.
aux_info (dict, optional) – A user-specified dictionary of per-circuit auxiliary information. Keys should be the circuits in this DataSet and values should be Python dictionaries.
 __iter__(self)¶
 __len__(self)¶
 __contains__(self, circuit)¶
Test whether data set contains a given circuit.
 Parameters
circuit (tuple or Circuit) – A tuple of operation labels or a Circuit instance which specifies the circuit to check for.
 Returns
bool – whether circuit was found.
 __hash__(self)¶
Return hash(self).
 __getitem__(self, circuit)¶
 __setitem__(self, circuit, outcome_dict_or_series)¶
 __delitem__(self, circuit)¶
 _get_row(self, circuit)¶
Get a row of data from this DataSet.
 Parameters
circuit (Circuit or tuple) – The gate sequence to extract data for.
 Returns
_DataSetRow
 _set_row(self, circuit, outcome_dict_or_series)¶
Set the counts for a row of this DataSet.
 Parameters
circuit (Circuit or tuple) – The gate sequence to set data for.
outcome_dict_or_series (dict or tuple) – The outcome count data, either a dictionary of outcome counts (with keys as outcome labels) or a tuple of lists. In the latter case this can be a 2-tuple: (outcome-label-list, timestamp-list) or a 3-tuple: (outcome-label-list, timestamp-list, repetition-count-list).
 Returns
None
 keys(self)¶
Returns the circuits used as keys of this DataSet.
 Returns
list – A list of Circuit objects which index the data counts within this data set.
 items(self)¶
Iterator over (circuit, timeSeries) pairs.
Here circuit is a tuple of operation labels and timeSeries is a _DataSetRow instance, which behaves similarly to a list of spam labels whose index corresponds to the time step.
 Returns
_DataSetKVIterator
 values(self)¶
Iterator over _DataSetRow instances corresponding to the time series data for each circuit.
 Returns
_DataSetValueIterator
 property outcome_labels(self)¶
Get a list of all the outcome labels contained in this DataSet.
 Returns
list of strings or tuples – A list where each element is an outcome label (which can be a string or a tuple of strings).
 property timestamps(self)¶
Get a list of all the (unique) timestamps contained in this DataSet.
 Returns
list of floats – A list where each element is a timestamp.
 gate_labels(self, prefix='G')¶
Get a list of all the distinct operation labels used in the circuits of this dataset.
 Parameters
prefix (str) – Filter the circuit labels so that only elements beginning with this prefix are returned. None performs no filtering.
 Returns
list of strings – A list where each element is an operation label.
 degrees_of_freedom(self, circuits=None, method='present_outcomes-1', aggregate_times=True)¶
Returns the number of independent degrees of freedom in the data for the circuits in circuits.
 Parameters
circuits (list of Circuits) – The list of circuits to count degrees of freedom for. If None then all of the DataSet’s circuits are used.
method ({'all_outcomes-1', 'present_outcomes-1', 'tuned'}) – How the degrees of freedom should be computed. ‘all_outcomes-1’ takes the number of circuits and multiplies this by the total number of outcomes (the length of what is returned by outcome_labels()) minus one. ‘present_outcomes-1’ counts on a per-circuit basis the number of present (usually = non-zero) outcomes recorded minus one. ‘tuned’ should be the most accurate, as it accounts for low-N “Poisson bump” behavior, but it is not the default because it is still under development. For timestamped data, see aggregate_times below.
aggregate_times (bool, optional) – Whether counts that occur at different times should be tallied together. If True, then even when counts occur at different times degrees of freedom are tallied on a per-circuit basis. If False, then counts occurring at distinct times are treated as independent of those at any other time, and are tallied separately. So, for example, if aggregate_times is False and a data row has 0- and 1-counts of 45 & 55 at time=0 and 42 and 58 at time=1 this row would contribute 2 degrees of freedom, not 1. It can sometimes be useful to set this to False when the DataSet holds coarse-grained data, but usually you want this to be left as True (especially for time-series data).
 Returns
int
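The worked example in the aggregate_times description (45 and 55 at time 0, then 42 and 58 at time 1) can be checked with a small sketch of the “present outcomes minus one” rule. This is a plain-Python illustration for a single row under the assumption that “present” means a nonzero count. The function name is hypothetical and the code is not pyGSTi's implementation.

```python
def dof_present_outcomes(row_bins, aggregate_times=True):
    """Count (present outcomes - 1), per row or per distinct timestamp."""
    if aggregate_times:
        # Pool all times: one tally for the whole row.
        present = {outcome for outcome, _time, rep in row_bins if rep > 0}
        return len(present) - 1 if present else 0
    # Treat each distinct time as an independent tally.
    by_time = {}
    for outcome, time, rep in row_bins:
        if rep > 0:
            by_time.setdefault(time, set()).add(outcome)
    return sum(len(present) - 1 for present in by_time.values())

# The row from the description: (outcome, time, count) bins.
row = [("0", 0, 45), ("1", 0, 55), ("0", 1, 42), ("1", 1, 58)]

dof_aggregated = dof_present_outcomes(row, aggregate_times=True)
dof_separate = dof_present_outcomes(row, aggregate_times=False)
```

With times pooled the row contributes 1 degree of freedom. With times kept separate it contributes 2, matching the description.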
 _collisionaction_update_circuit(self, circuit)¶
 _add_explicit_repetition_counts(self)¶
Build internal repetition counts if they don’t exist already.
This method is usually unnecessary, as repetition counts are almost always built as soon as they are needed.
 Returns
None
 add_count_dict(self, circuit, count_dict, record_zero_counts=True, aux=None, update_ol=True)¶
Add a single circuit’s counts to this DataSet
 Parameters
circuit (tuple or Circuit) – A tuple of operation labels specifying the circuit or a Circuit object
count_dict (dict) – A dictionary with keys = outcome labels and values = counts
record_zero_counts (bool, optional) – Whether zero-counts are actually recorded (stored) in this DataSet. If False, then zero counts are ignored, except for potentially registering new outcome labels.
aux (dict, optional) – A dictionary of auxiliary meta information to be included with this set of data counts (associated with circuit).
update_ol (bool, optional) – This argument is for internal use only and should be left as True.
 Returns
None
 add_count_list(self, circuit, outcome_labels, counts, record_zero_counts=True, aux=None, update_ol=True, unsafe=False)¶
Add a single circuit’s counts to this DataSet
 Parameters
circuit (tuple or Circuit) – A tuple of operation labels specifying the circuit or a Circuit object
outcome_labels (list or tuple) – The outcome labels corresponding to counts.
counts (list or tuple) – The counts themselves.
record_zero_counts (bool, optional) – Whether zero-counts are actually recorded (stored) in this DataSet. If False, then zero counts are ignored, except for potentially registering new outcome labels.
aux (dict, optional) – A dictionary of auxiliary meta information to be included with this set of data counts (associated with circuit).
update_ol (bool, optional) – This argument is for internal use only and should be left as True.
unsafe (bool, optional) – True means that outcome_labels is guaranteed to hold tuple-type outcome labels and never plain strings. Only set this to True if you know what you’re doing.
 Returns
None
 add_count_arrays(self, circuit, outcome_index_array, count_array, record_zero_counts=True, aux=None)¶
Add the outcomes for a single circuit, formatted as raw data arrays.
 Parameters
circuit (Circuit) – The circuit to add data for.
outcome_index_array (numpy.ndarray) – An array of outcome indices, which must be values of self.olIndex (which maps outcome labels to indices).
count_array (numpy.ndarray) – An array of integer (or sometimes floating point) counts, one corresponding to each outcome index (element of outcome_index_array).
record_zero_counts (bool, optional) – Whether zero counts (zeros in count_array) should be stored explicitly or not stored and inferred. Setting to False reduces the space taken by data sets containing lots of zero counts, but makes some objective function evaluations less precise.
aux (dict or None, optional) – If not None, a dictionary of user-defined auxiliary information that should be associated with this circuit.
 Returns
None
 add_cirq_trial_result(self, circuit, trial_result, key)¶
Add a single circuit’s counts — stored in a Cirq TrialResult — to this DataSet
 Parameters
circuit (tuple or Circuit) – A tuple of operation labels specifying the circuit or a Circuit object. Note that this must be a PyGSTi circuit — not a Cirq circuit.
trial_result (cirq.TrialResult) – The TrialResult to add
key (str) – The string key of the measurement. Set by cirq.measure.
 Returns
None
 add_raw_series_data(self, circuit, outcome_label_list, time_stamp_list, rep_count_list=None, overwrite_existing=True, record_zero_counts=True, aux=None, update_ol=True, unsafe=False)¶
Add a single circuit’s counts to this DataSet
 Parameters
circuit (tuple or Circuit) – A tuple of operation labels specifying the circuit or a Circuit object
outcome_label_list (list) – A list of outcome labels (strings or tuples). An element’s index links it to a particular time step (i.e. the ith element of the list specifies the outcome of the ith measurement in the series).
time_stamp_list (list) – A list of floating point timestamps, each associated with the single corresponding outcome in outcome_label_list. Must be the same length as outcome_label_list.
rep_count_list (list, optional) – A list of integer counts specifying how many outcomes of type given by outcome_label_list occurred at the time given by time_stamp_list. If None, then all counts are assumed to be 1. When not None, must be the same length as outcome_label_list.
overwrite_existing (bool, optional) – Whether to overwrite the data for circuit (if it exists). If False, then the given lists are appended (added) to existing data.
record_zero_counts (bool, optional) – Whether zero-counts (elements of rep_count_list that are zero) are actually recorded (stored) in this DataSet. If False, then zero counts are ignored, except for potentially registering new outcome labels.
aux (dict, optional) – A dictionary of auxiliary meta information to be included with this set of data counts (associated with circuit).
update_ol (bool, optional) – This argument is for internal use only and should be left as True.
unsafe (bool, optional) – When True, don’t bother checking that outcome_label_list contains tuple-type outcome labels and automatically upgrading strings to 1-tuples. Only set this to True if you know what you’re doing and need the marginally faster performance.
 Returns
None
 _add_raw_arrays(self, circuit, oli_array, time_array, rep_array, overwrite_existing, record_zero_counts, aux)¶
 update_ol(self)¶
Updates the internal outcome-label list in this dataset.
Call this after calling add_count_dict(…) or add_raw_series_data(…) with update_ol=False.
 Returns
None
 add_series_data(self, circuit, count_dict_list, time_stamp_list, overwrite_existing=True, record_zero_counts=True, aux=None)¶
Add a single circuit’s counts to this DataSet
 Parameters
circuit (tuple or Circuit) – A tuple of operation labels specifying the circuit or a Circuit object
count_dict_list (list) – A list of dictionaries holding the outcome-label:count pairs for each time step (times given by time_stamp_list).
time_stamp_list (list) – A list of floating point timestamps, each associated with an entire dictionary of outcomes specified by count_dict_list.
overwrite_existing (bool, optional) – If True, overwrite any existing data for the circuit. If False, add the count data with the next non-negative integer timestamp.
record_zero_counts (bool, optional) – Whether zero-counts (elements of the dictionaries in count_dict_list that are zero) are actually recorded (stored) in this DataSet. If False, then zero counts are ignored, except for potentially registering new outcome labels.
aux (dict, optional) – A dictionary of auxiliary meta information to be included with this set of data counts (associated with circuit).
 Returns
None
 aggregate_outcomes(self, label_merge_dict, record_zero_counts=True)¶
Creates a DataSet which merges certain outcomes in this DataSet.
Used, for example, to aggregate a 2-qubit, 4-outcome DataSet into a 1-qubit, 2-outcome DataSet.
 Parameters
label_merge_dict (dictionary) – The dictionary whose keys define the new DataSet outcomes, and whose values are lists of input DataSet outcomes that are to be summed together. For example, if a two-qubit DataSet has outcome labels “00”, “01”, “10”, and “11”, and we want to “aggregate out” the second qubit, we could use label_merge_dict = {'0': ['00', '01'], '1': ['10', '11']}. When doing this, however, it may be better to use filter_qubits, which also updates the circuits.
record_zero_counts (bool, optional) – Whether zero-counts are actually recorded (stored) in the returned (merged) DataSet. If False, then zero counts are ignored, except for potentially registering new outcome labels.
 Returns
merged_dataset (DataSet object) – The DataSet with outcomes merged according to the rules given in label_merge_dict.
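The merge rule above amounts to summing, for each new label, the counts of the old labels listed under it. A minimal sketch on a plain counts dictionary follows. The helper merge_counts and the numbers are hypothetical, and the real method returns a new DataSet rather than a dict.

```python
def merge_counts(counts, label_merge_dict):
    """Sum the counts of the old labels listed under each new label."""
    return {
        new_label: sum(counts.get(old, 0) for old in old_labels)
        for new_label, old_labels in label_merge_dict.items()
    }

# Hypothetical two-qubit counts for one circuit.
counts = {"00": 10, "01": 20, "10": 30, "11": 40}

# Aggregate out the second qubit, as in the example above.
label_merge_dict = {"0": ["00", "01"], "1": ["10", "11"]}
merged = merge_counts(counts, label_merge_dict)
```

The total number of counts is preserved as long as every old label appears under exactly one new label.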
 aggregate_std_nqubit_outcomes(self, qubit_indices_to_keep, record_zero_counts=True)¶
Creates a DataSet which merges certain outcomes in this DataSet.
Used, for example, to aggregate a 2-qubit, 4-outcome DataSet into a 1-qubit, 2-outcome DataSet. This assumes that outcome labels are in the standard format whereby each qubit corresponds to a single ‘0’ or ‘1’ character.
 Parameters
qubit_indices_to_keep (list) – A list of integers specifying which qubits should be kept, that is, not aggregated.
record_zero_counts (bool, optional) – Whether zero-counts are actually recorded (stored) in the returned (merged) DataSet. If False, then zero counts are ignored, except for potentially registering new outcome labels.
 Returns
merged_dataset (DataSet object) – The DataSet with outcomes merged.
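For standard bitstring labels, keeping a subset of qubits means keeping the corresponding characters of each label and summing counts that collapse to the same shortened label. The sketch below assumes that qubit index i refers to the i-th character of the label. The helper name marginalize is hypothetical and the code does not call pyGSTi.

```python
def marginalize(counts, qubit_indices_to_keep):
    """Keep only the listed character positions and sum colliding labels."""
    merged = {}
    for label, n in counts.items():
        kept = "".join(label[i] for i in qubit_indices_to_keep)
        merged[kept] = merged.get(kept, 0) + n
    return merged

# Hypothetical two-qubit counts for one circuit.
counts = {"00": 10, "01": 20, "10": 30, "11": 40}

keep_first = marginalize(counts, [0])
keep_second = marginalize(counts, [1])
```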
 add_auxiliary_info(self, circuit, aux)¶
Add auxiliary meta information to circuit.
 Parameters
circuit (tuple or Circuit) – A tuple of operation labels specifying the circuit or a Circuit object
aux (dict, optional) – A dictionary of auxiliary meta information to be included with this set of data counts (associated with circuit).
 Returns
None
 add_counts_from_dataset(self, other_data_set)¶
Append another DataSet’s data to this DataSet
 Parameters
other_data_set (DataSet) – The dataset to take counts from.
 Returns
None
 add_series_from_dataset(self, other_data_set)¶
Append another DataSet’s series data to this DataSet
 Parameters
other_data_set (DataSet) – The dataset to take time series data from.
 Returns
None
 property meantimestep(self)¶
The mean timestep, averaged over the timestep for each circuit and over circuits.
 Returns
float
 property has_constant_totalcounts_pertime(self)¶
True if the data for every circuit has the same number of total counts at every data collection time.
This will return True even if there is a different number of total counts per circuit (i.e., after aggregating over time), as long as every circuit has the same total counts per time step (this will happen when the number of time steps varies between circuits).
 Returns
bool
 property totalcounts_pertime(self)¶
Total counts per time, if this is constant over times and circuits.
When that doesn’t hold, an error is raised.
 Returns
float or int
 property has_constant_totalcounts(self)¶
True if the data for every circuit has the same number of total counts.
 Returns
bool
 property has_trivial_timedependence(self)¶
True if all the data in this DataSet occurs at time 0.
 Returns
bool
 __str__(self)¶
Return str(self).
 to_str(self, mode='auto')¶
Render this DataSet as a string.
 Parameters
mode ({"auto","time-dependent","time-independent"}) – Whether to display the data as a time series of outcome counts (“time-dependent”) or to report per-outcome counts aggregated over time (“time-independent”). If “auto” is specified, then the time-independent mode is used only if all time stamps in the DataSet are equal to zero (trivial time dependence).
 Returns
str
 truncate(self, list_of_circuits_to_keep, missing_action='raise')¶
Create a truncated dataset comprised of a subset of the circuits in this dataset.
 Parameters
list_of_circuits_to_keep (list of (tuples or Circuits)) – A list of the circuits for the new returned dataset. If a circuit is given in this list that isn’t in the original data set, missing_action determines the behavior.
missing_action ({"raise","warn","ignore"}) – What to do when a circuit in list_of_circuits_to_keep is not in the data set (raise a KeyError, issue a warning, or do nothing).
 Returns
DataSet – The truncated data set.
 time_slice(self, start_time, end_time, aggregate_to_time=None)¶
Creates a DataSet by aggregating the counts within the [start_time, end_time) interval.
 Parameters
start_time (float) – The starting time.
end_time (float) – The ending time.
aggregate_to_time (float, optional) – If not None, a single timestamp to give all the data in the specified range, resulting in a time-independent DataSet. If None, then the original timestamps are preserved.
 Returns
DataSet
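The half-open interval and the effect of aggregate_to_time can be sketched on a single row of (outcome, time, count) bins. This is a plain-Python illustration with a hypothetical function and data. The real method works across every circuit and returns a DataSet.

```python
def time_slice(bins, start_time, end_time, aggregate_to_time=None):
    """Keep bins with start_time <= t < end_time, optionally collapsing times."""
    selected = [(ol, t, rep) for ol, t, rep in bins if start_time <= t < end_time]
    if aggregate_to_time is None:
        return selected  # original timestamps preserved
    totals = {}
    for ol, _t, rep in selected:
        totals[ol] = totals.get(ol, 0) + rep
    return [(ol, aggregate_to_time, rep) for ol, rep in totals.items()]

bins = [("0", 0.0, 45), ("1", 0.0, 55), ("0", 1.0, 42), ("1", 1.0, 58), ("0", 2.0, 50)]

kept = time_slice(bins, 0.0, 2.0)
collapsed = time_slice(bins, 0.0, 2.0, aggregate_to_time=0.0)
```

The bin at time 2.0 is excluded because the end of the interval is open.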
 split_by_time(self, aggregate_to_time=None)¶
Creates a dictionary of DataSets, each of which is an equal-time slice of this DataSet.
The keys of the returned dictionary are the distinct timestamps in this dataset.
 Parameters
aggregate_to_time (float, optional) – If not None, a single timestamp to give all the data in each returned data set, resulting in time-independent DataSets. If None, then the original timestamps are preserved.
 Returns
OrderedDict – A dictionary of DataSet objects whose keys are the timestamp values of the original (this) data set in sorted order.
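Splitting by time groups the data by distinct timestamp, with keys in sorted order. The sketch below shows the grouping for one row of hypothetical (outcome, time, count) bins. The function name is illustrative, and the real method returns DataSet objects rather than lists.

```python
from collections import OrderedDict

def split_bins_by_time(bins):
    """Group bins by timestamp, with keys in ascending time order."""
    slices = OrderedDict()
    # sorted() is stable, so bins sharing a timestamp keep their original order.
    for ol, t, rep in sorted(bins, key=lambda b: b[1]):
        slices.setdefault(t, []).append((ol, t, rep))
    return slices

bins = [("0", 1.0, 42), ("0", 0.0, 45), ("1", 1.0, 58), ("1", 0.0, 55)]
slices = split_bins_by_time(bins)
```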
 drop_zero_counts(self)¶
Creates a copy of this data set that doesn’t include any zero counts.
 Returns
DataSet
 process_times(self, process_times_array_fn)¶
Manipulate this DataSet’s timestamps according to process_times_array_fn.
For example, using the following process_times_array_fn would change the timestamps for each circuit to sequential integers.
```
def process_times_array_fn(times):
    return list(range(len(times)))
```
 Parameters
process_times_array_fn (function) – A function which takes a single array-of-timestamps argument and returns another similarly-sized array. This function is called, once per circuit, with the circuit’s array of timestamps.
 Returns
DataSet – A new data set with altered timestamps.
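As a second illustration of a timestamp-processing function, the sketch below shifts each circuit's timestamps so that they start at zero, and applies the function once per circuit as described above. The function shift_to_zero and the dictionary of per-circuit timestamps are hypothetical stand-ins, and no pyGSTi call is made.

```python
def shift_to_zero(times):
    """Return a same-length list of timestamps measured from the first one."""
    if len(times) == 0:
        return []
    first = times[0]
    return [t - first for t in times]

# Hypothetical per-circuit timestamp arrays, keyed by a circuit name.
times_by_circuit = {"Gx": [10.0, 12.0, 15.0], "GxGy": [100.0, 101.0]}

# The function is applied independently to each circuit's timestamps.
new_times = {name: shift_to_zero(ts) for name, ts in times_by_circuit.items()}
```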
 process_circuits(self, processor_fn, aggregate=False)¶
Create a new data set by manipulating this DataSet’s circuits (keys) according to processor_fn.
The new DataSet’s circuits result from running each of this DataSet’s circuits through processor_fn. This can be useful when “tracing out” qubits in a dataset containing multi-qubit data.
 Parameters
processor_fn (function) – A function which takes a single Circuit argument and returns another (or the same) Circuit. This function may also return None, in which case the data for that circuit is deleted.
aggregate (bool, optional) – When True, aggregate the data for circuits that processor_fn assigns to the same “new” circuit. When False, use the data from the last original circuit that maps to a given “new” circuit.
 Returns
DataSet
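The three behaviors described above (remapping, dropping on None, and the aggregate switch) can be sketched with circuits represented as tuples of operation labels and rows as plain count dictionaries. All names here are hypothetical, time stamps are ignored, and the code is an illustration of the semantics rather than pyGSTi's implementation.

```python
def process_rows(rows, processor_fn, aggregate=False):
    """Re-key rows through processor_fn; drop rows mapped to None."""
    new_rows = {}
    for circuit, counts in rows.items():
        new_circuit = processor_fn(circuit)
        if new_circuit is None:
            continue  # the data for this circuit is deleted
        if aggregate and new_circuit in new_rows:
            # Sum counts from all circuits that map to the same new circuit.
            for label, n in counts.items():
                new_rows[new_circuit][label] = new_rows[new_circuit].get(label, 0) + n
        else:
            # Without aggregation, the last original circuit wins.
            new_rows[new_circuit] = dict(counts)
    return new_rows

rows = {
    ("Gx", "Gi"): {"0": 10, "1": 90},
    ("Gx",): {"0": 20, "1": 80},
    ("Gi",): {"0": 100},
}

def drop_idle(circuit):
    """Remove 'Gi' labels; return None if nothing is left."""
    kept = tuple(op for op in circuit if op != "Gi")
    return kept or None

summed = process_rows(rows, drop_idle, aggregate=True)
last_wins = process_rows(rows, drop_idle, aggregate=False)
```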
 process_circuits_inplace(self, processor_fn, aggregate=False)¶
Manipulate this DataSet’s circuits (keys) in place according to processor_fn.
All of this DataSet’s circuits are updated by running each one through processor_fn. This can be useful when “tracing out” qubits in a dataset containing multi-qubit data.
 Parameters
processor_fn (function) – A function which takes a single Circuit argument and returns another (or the same) Circuit. This function may also return None, in which case the data for that circuit is deleted.
aggregate (bool, optional) – When True, aggregate the data for circuits that processor_fn assigns to the same “new” circuit. When False, use the data from the last original circuit that maps to a given “new” circuit.
 Returns
None
 remove(self, circuits, missing_action='raise')¶
Remove (delete) the data for circuits from this DataSet.
 Parameters
circuits (iterable) – An iterable over Circuit-like objects specifying the keys (circuits) to remove.
missing_action ({"raise","warn","ignore"}) – What to do when a circuit in circuits is not in this data set (raise a KeyError, issue a warning, or do nothing).
 Returns
None
 _remove(self, gstr_indices)¶
Removes the data in indices given by gstr_indices
 copy(self)¶
Make a copy of this DataSet.
 Returns
DataSet
 copy_nonstatic(self)¶
Make a non-static copy of this DataSet.
 Returns
DataSet
 done_adding_data(self)¶
Promotes a non-static DataSet to a static (read-only) DataSet.
This method should be called after all data has been added.
 Returns
None
 __getstate__(self)¶
 __setstate__(self, state_dict)¶
 save(self, file_or_filename)¶
 write_binary(self, file_or_filename)¶
Write this data set to a binary-format file.
 Parameters
file_or_filename (string or file object) – If a string, interpreted as a filename. If this filename ends in “.gz”, the file will be gzip compressed.
 Returns
None
 load(self, file_or_filename)¶
 read_binary(self, file_or_filename)¶
Read a DataSet from a binary file, clearing any data it contained previously.
The file should have been created with DataSet.write_binary.
 Parameters
file_or_filename (str or buffer) – The file or filename to load from.
 Returns
None
 rename_outcome_labels(self, old_to_new_dict)¶
Replaces existing outcome labels with new ones as per old_to_new_dict.
 Parameters
old_to_new_dict (dict) – A mapping from old/existing outcome labels to new ones. Strings in keys or values are automatically converted to 1-tuples. Missing outcome labels are left unaltered.
 Returns
None
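The two rules in the parameter description (strings become 1-tuples, and labels missing from the mapping are left alone) can be sketched on a plain counts dictionary. The helpers below are hypothetical and return a new dictionary, whereas the real method modifies the DataSet and returns None.

```python
def to_tuple(label):
    """Upgrade a plain string label to a 1-tuple; leave tuples as tuples."""
    return (label,) if isinstance(label, str) else tuple(label)

def rename_labels(counts, old_to_new_dict):
    """Rename the keys of counts; labels not in the mapping are unchanged."""
    mapping = {to_tuple(old): to_tuple(new) for old, new in old_to_new_dict.items()}
    return {mapping.get(to_tuple(ol), to_tuple(ol)): n for ol, n in counts.items()}

# Hypothetical counts keyed by tuple-type outcome labels.
counts = {("0",): 45, ("1",): 55, ("2",): 3}

renamed = rename_labels(counts, {"0": "up", "1": "down"})
```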
 add_std_nqubit_outcome_labels(self, nqubits)¶
Adds all the “standard” outcome labels (e.g. ‘0010’) on nqubits qubits.
This is useful to ensure that, even if not all outcomes appear in the data, all are recognized as being potentially valid outcomes (and so attempts to get counts for these outcomes will give 0 rather than raising an error).
 Parameters
nqubits (int) – The number of qubits. For example, if equal to 3 the outcome labels ‘000’, ‘001’, … ‘111’ are added.
 Returns
None
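The “standard” labels for n qubits are all length-n strings of ‘0’ and ‘1’. A short sketch of how such a list can be generated follows. The function name is hypothetical and the code does not reflect how pyGSTi stores the labels internally.

```python
from itertools import product

def std_nqubit_outcome_labels(nqubits):
    """All bitstrings of length nqubits, in ascending binary order."""
    return ["".join(bits) for bits in product("01", repeat=nqubits)]

labels = std_nqubit_outcome_labels(3)
```

The list has 2**nqubits entries, so it doubles in size with each added qubit.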
 add_outcome_labels(self, outcome_labels, update_ol=True)¶
Adds new valid outcome labels.
Ensures that all the elements of outcome_labels are stored as valid outcomes for circuits in this DataSet, adding new outcomes as necessary.
 Parameters
outcome_labels (list or generator) – A list or generator of string- or tuple-valued outcome labels.
update_ol (bool, optional) – Whether to update internal mappings to reflect the new outcome labels. Leave this as True unless you really know what you’re doing.
 Returns
None
 auxinfo_dataframe(self, pivot_valuename=None, pivot_value=None, drop_columns=False)¶
Create a Pandas dataframe with aux-data from this dataset.
 Parameters
pivot_valuename (str, optional) – If not None, the resulting dataframe is pivoted using pivot_valuename as the column whose values name the pivoted table’s column names. If None and pivot_value is not None, “ValueName” is used.
pivot_value (str, optional) – If not None, the resulting dataframe is pivoted such that values of the pivot_value column are rearranged into new columns whose names are given by the values of the pivot_valuename column. If None and pivot_valuename is not None, “Value” is used.
drop_columns (bool or list, optional) – A list of column names to drop (prior to performing any pivot). If True appears in this list or is given directly, then all constant-valued columns are dropped as well. No columns are dropped when drop_columns == False.
 Returns
pandas.DataFrame