pygsti.data.datasetconstruction
Functions for creating data
Module Contents
Functions

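simulate_data(): Creates a DataSet using the probabilities obtained from a model.
aggregate_dataset_outcomes(): Creates a DataSet which merges certain outcomes in input DataSet.
filter_dataset(): Creates a DataSet that is the restriction of dataset to sectors_to_keep.
trim_to_constant_numtimesteps(): Trims a DataSet so that each circuit’s data comprises the same number of timesteps.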
 pygsti.data.datasetconstruction.simulate_data(model_or_dataset, circuit_list, num_samples, sample_error='multinomial', seed=None, rand_state=None, alias_dict=None, collision_action='aggregate', record_zero_counts=True, comm=None, mem_limit=None, times=None)
Creates a DataSet using the probabilities obtained from a model.
Parameters
 model_or_dataset : Model or DataSet object
The source of the underlying probabilities used to generate the data. If a Model, the model whose probabilities generate the data. If a DataSet, the data set whose frequencies generate the data.
 circuit_list : list of (tuples or Circuits) or ExperimentDesign or None
Each tuple or Circuit contains operation labels and specifies a gate sequence whose counts are included in the returned DataSet, e.g. [ (), ('Gx',), ('Gx','Gy') ]. If an ExperimentDesign, then the design’s .all_circuits_needing_data list is used as the circuit list.
 num_samples : int or list of ints or None
The simulated number of samples for each circuit. This only has an effect when sample_error == "binomial" or "multinomial". If an integer, all circuits have this number of total samples. If a list, integer elements specify the number of samples for the corresponding circuit. If None, then model_or_dataset must be a DataSet, and total counts are taken from it (on a per-circuit basis).
 sample_error : string, optional
What type of sample error is included in the counts. Can be:
“none” - no sample error: counts are floating point numbers such that the exact probability can be found by the ratio of count / total.
“clip” - no sample error, but probabilities are clipped to [0,1] so that, e.g., counts are never negative.
“round” - same as “clip”, except counts are rounded to the nearest integer.
“binomial” - the number of counts is taken from a binomial distribution. The distribution has parameters p = (clipped) probability of the circuit and n = number of samples. This can only be used when there are exactly two SPAM labels in model_or_dataset.
“multinomial” - counts are taken from a multinomial distribution. The distribution has parameters p_k = (clipped) probability of the circuit using the k-th SPAM label and n = number of samples.
 seed : int, optional
If not None, a seed for numpy’s random number generator, which is used to sample from the binomial or multinomial distribution.
 rand_state : numpy.random.RandomState
A RandomState object to generate samples from. Can be useful to set instead of seed if you want reproducible distribution samples across multiple random function calls but you don’t want to bother with manually incrementing seeds between those calls.
 alias_dict : dict, optional
A dictionary mapping single operation labels into tuples of one or more other operation labels, which is used to translate the given circuits before values are computed using model_or_dataset. The resulting DataSet, however, contains the untranslated circuits as keys.
 collision_action : {“aggregate”, “keepseparate”}
Determines how duplicate circuits are handled by the resulting DataSet. Please see the constructor documentation for DataSet.
 record_zero_counts : bool, optional
Whether zero counts are actually recorded (stored) in the returned DataSet. If False, then zero counts are ignored, except for potentially registering new outcome labels.
 comm : mpi4py.MPI.Comm, optional
When not None, an MPI communicator for distributing the computation across multiple processors and ensuring that the same dataset is generated on each processor.
 mem_limit : int, optional
A rough memory limit in bytes which is used to determine job allocation when there are multiple processors.
 times : iterable, optional
When not None, a list of timestamps at which data should be sampled. num_samples samples will be simulated at each time value, meaning that each circuit in circuit_list will be evaluated with the given time value as its start time.
Returns
 DataSet
A static data set filled with counts for the specified circuits.
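The sample_error modes described above can be illustrated with a small standalone sketch. It is not pyGSTi's implementation: the helper name `sample_counts` and the plain-dict representation of outcome probabilities are assumptions made for illustration, and Python's `random` module stands in for numpy's generator. Note that `random.choices` normalizes its weights implicitly, which is a property of this sketch rather than a documented behavior of simulate_data.

```python
import random
from collections import Counter


def sample_counts(probs, num_samples, sample_error="multinomial", rng=None):
    """Turn outcome probabilities into counts (illustrative sketch only).

    probs maps outcome labels to probabilities; rng is a random.Random.
    """
    if sample_error == "none":
        # Exact floating-point counts: count / total recovers the probability.
        return {k: p * num_samples for k, p in probs.items()}

    # Every other mode works with probabilities clipped to [0, 1].
    clipped = {k: min(max(p, 0.0), 1.0) for k, p in probs.items()}
    if sample_error == "clip":
        return {k: p * num_samples for k, p in clipped.items()}
    if sample_error == "round":
        return {k: int(round(p * num_samples)) for k, p in clipped.items()}

    if rng is None:
        rng = random.Random()
    labels = list(clipped)
    if sample_error == "binomial":
        if len(labels) != 2:
            raise ValueError("binomial sampling needs exactly two outcome labels")
        first, second = labels
        # One Bernoulli trial per sample for the first label.
        n_first = sum(rng.random() < clipped[first] for _ in range(num_samples))
        return {first: n_first, second: num_samples - n_first}
    if sample_error == "multinomial":
        # Draw num_samples outcomes with the clipped probabilities as weights.
        draws = rng.choices(labels, weights=[clipped[k] for k in labels],
                            k=num_samples)
        tally = Counter(draws)
        return {k: tally.get(k, 0) for k in labels}
    raise ValueError(f"unknown sample_error: {sample_error!r}")


# Passing a seeded generator makes repeated calls reproducible,
# analogous to supplying seed or rand_state to simulate_data.
counts = sample_counts({"0": 0.5, "1": 0.3, "2": 0.2}, 50,
                       "multinomial", random.Random(7))
```

In the sampled modes the counts always sum to num_samples; in the “none” and “clip” modes they are floats that need not be integers.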
 pygsti.data.datasetconstruction.aggregate_dataset_outcomes(dataset, label_merge_dict, record_zero_counts=True)
Creates a DataSet which merges certain outcomes in input DataSet.
This is used, for example, to aggregate a 2-qubit, 4-outcome DataSet into a 1-qubit, 2-outcome DataSet.
Parameters
 dataset : DataSet object
The input DataSet whose results will be simplified according to the rules set forth in label_merge_dict.
 label_merge_dict : dictionary
The dictionary whose keys define the new DataSet outcomes, and whose items are lists of input DataSet outcomes that are to be summed together. For example, if a two-qubit DataSet has outcome labels “00”, “01”, “10”, and “11”, and we want to “aggregate out” the second qubit, we could use label_merge_dict = {'0': ['00', '01'], '1': ['10', '11']}. When doing this, however, it may be better to use filter_dataset(), which also updates the circuits.
 record_zero_counts : bool, optional
Whether zero counts are actually recorded (stored) in the returned (merged) DataSet. If False, then zero counts are ignored, except for potentially registering new outcome labels.
Returns
 merged_dataset : DataSet object
The DataSet with outcomes merged according to the rules given in label_merge_dict.
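As a rough illustration of the merge rule, the sketch below applies a label_merge_dict to the counts of a single circuit held in a plain dict. The helper `merge_outcomes` is hypothetical and does not use pyGSTi's DataSet class; it only mirrors the summation and the record_zero_counts behavior described above.

```python
def merge_outcomes(counts, label_merge_dict, record_zero_counts=True):
    """Sum the counts of old outcome labels into new labels (sketch only)."""
    merged = {}
    for new_label, old_labels in label_merge_dict.items():
        # Outcomes missing from counts contribute zero.
        total = sum(counts.get(old, 0) for old in old_labels)
        if total != 0 or record_zero_counts:
            merged[new_label] = total
    return merged


# Aggregate out the second qubit of a two-qubit outcome distribution.
two_qubit_counts = {"00": 40, "01": 10, "10": 30, "11": 20}
merged = merge_outcomes(two_qubit_counts,
                        {"0": ["00", "01"], "1": ["10", "11"]})
# merged == {"0": 50, "1": 50}
```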
 pygsti.data.datasetconstruction.filter_dataset(dataset, sectors_to_keep, sindices_to_keep=None, new_sectors=None, idle=((),), record_zero_counts=True, filtercircuits=True)
Creates a DataSet that is the restriction of dataset to sectors_to_keep.
This function aggregates (sums) outcomes in dataset which differ only in sectors (usually qubits; see below) not in sectors_to_keep, and removes any operation labels which act specifically on sectors not in sectors_to_keep (e.g. an idle gate acting on all sectors because its .sslbls is None will not be removed).
Here “sectors” are state-space labels, present in the circuits of dataset. Each sector also corresponds to a particular character position within the outcome labels of dataset. Thus, for this function to work, the outcome labels of dataset must all be 1-tuples whose sole element is an n-character string such that each character represents the outcome of a single sector. If the state-space labels are integers, then they can serve as both a label and an outcome-string position. The argument new_sectors may be given to rename the kept state-space labels in the returned DataSet’s circuits.
A typical case is when the state space is that of n qubits, and the state-space labels are the integers 0 to n-1. As stated above, in this case there is no need to specify sindices_to_keep. One may want to “rebase” the indices to 0 in the returned data set using new_sectors (e.g. sectors_to_keep == [4,5,6] and new_sectors == [0,1,2]).
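The outcome side of this restriction can be sketched without pyGSTi. The hypothetical helper `restrict_outcomes` below keeps only the characters at the retained positions of each outcome string and sums the counts of outcomes that become identical; the 1-tuple keys follow the outcome-label format described above, while the plain-dict container is an assumption for illustration.

```python
from collections import Counter


def restrict_outcomes(counts, sindices_to_keep):
    """Keep only the given character positions of each outcome (sketch only)."""
    reduced = Counter()
    for outcome, count in counts.items():
        (outcome_string,) = outcome  # outcome labels are 1-tuples
        kept = "".join(outcome_string[i] for i in sindices_to_keep)
        reduced[(kept,)] += count
    return dict(reduced)


# Keep only the second character (position 1) of each 2-character outcome.
counts = {("00",): 40, ("01",): 10, ("10",): 30, ("11",): 20}
second_only = restrict_outcomes(counts, [1])
# second_only == {("0",): 70, ("1",): 30}
```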
Parameters
 dataset : DataSet object
The input DataSet whose data will be processed.
 sectors_to_keep : list or tuple
The state-space labels (strings or integers) of the “sectors” to keep in the returned DataSet.
 sindices_to_keep : list or tuple, optional
The 0-based indices of the labels in sectors_to_keep which give the positions of the corresponding letters in each outcome string (see above). If the state-space labels are integers (labeling qubits) that are also letter positions, then this may be left as None. For example, if the outcome strings of dataset are '00', '01', '10', and '11' and the first position refers to qubit “Q1” and the second to qubit “Q2” (present in operation labels), then to extract just “Q2” data sectors_to_keep should be ["Q2"] and sindices_to_keep should be [1].
 new_sectors : list or tuple, optional
New sector names to map the elements of sectors_to_keep onto in the output DataSet’s circuits. None means the labels are not renamed. This can be useful if, for instance, you want to run a 2-qubit protocol that expects the qubits to be labeled “0” and “1” on qubits “4” and “5” of a larger set. Simply set sectors_to_keep == [4,5] and new_sectors == [0,1].
 idle : string or Label, optional
The operation label to be used when there are no kept components of a “layer” (element) of a circuit.
 record_zero_counts : bool, optional
Whether zero counts present in the original dataset are recorded (stored) in the returned (filtered) DataSet. If False, then such zero counts are ignored, except for potentially registering new outcome labels.
 filtercircuits : bool, optional
Whether or not to “filter” the circuits, by removing gates that act outside of the sectors_to_keep.
Returns
 filtered_dataset : DataSet object
The DataSet with outcomes and circuits filtered as described above.
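The circuit side of the filtering (sectors_to_keep, new_sectors, and idle) can be pictured with the following sketch. Everything in it is an assumption made for illustration: circuits are modeled as lists of layers, each layer a list of (name, sslbls) pairs; the helper `filter_circuit` and the `"Gidle"` placeholder label are hypothetical; and a label touching any sector outside sectors_to_keep is dropped, which may differ from how pyGSTi treats gates that straddle kept and discarded sectors.

```python
def filter_circuit(layers, sectors_to_keep, new_sectors=None,
                   idle_label=("Gidle", None)):
    """Drop labels acting outside sectors_to_keep and rename sectors (sketch)."""
    targets = new_sectors if new_sectors is not None else sectors_to_keep
    rename = dict(zip(sectors_to_keep, targets))
    filtered = []
    for layer in layers:
        kept = []
        for name, sslbls in layer:
            if sslbls is None:
                # Acts on all sectors, so it is never removed.
                kept.append((name, None))
            elif all(s in rename for s in sslbls):
                kept.append((name, tuple(rename[s] for s in sslbls)))
        # A layer with no kept components is replaced by the idle label.
        filtered.append(kept if kept else [idle_label])
    return filtered


layers = [
    [("Gx", (4,)), ("Gy", (7,))],
    [("Gz", (7,))],
    [("Gcnot", (4, 5))],
    [("Gi", None)],
]
result = filter_circuit(layers, sectors_to_keep=[4, 5], new_sectors=[0, 1])
```

Here the second layer acts only on a discarded sector, so it collapses to the idle placeholder, while the kept sectors 4 and 5 are rebased to 0 and 1.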
 pygsti.data.datasetconstruction.trim_to_constant_numtimesteps(ds)
Trims a DataSet so that each circuit’s data comprises the same number of timesteps.
Returns a new dataset that has data for the same number of time steps for every circuit. This is achieved by discarding all time-series data for every circuit with a time step index beyond ‘min-time-step-index’, where ‘min-time-step-index’ is the minimum number of time steps over circuits.
Parameters
 ds : DataSet
The dataset to trim.
Returns
 DataSet
The trimmed dataset, obtained by potentially discarding some of the data.
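The trimming rule can be sketched on plain Python data. The helper `trim_to_min_timesteps` and the dict-of-lists layout (one entry per time step for each circuit) are assumptions for illustration, not pyGSTi's internal time-series representation.

```python
def trim_to_min_timesteps(series_by_circuit):
    """Truncate every circuit's time series to the shortest length (sketch)."""
    # The minimum number of time steps over all circuits.
    min_steps = min((len(s) for s in series_by_circuit.values()), default=0)
    return {circuit: list(series[:min_steps])
            for circuit, series in series_by_circuit.items()}


# Each entry holds the per-time-step counts recorded for one circuit.
data = {
    ("Gx",): [{"0": 9, "1": 1}, {"0": 8, "1": 2}, {"0": 7, "1": 3}],
    ("Gy",): [{"0": 5, "1": 5}, {"0": 6, "1": 4}],
}
trimmed = trim_to_min_timesteps(data)
# Every circuit now has exactly two time steps.
```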