pygsti.extras.drift.signal
¶
Signal analysis functions for timeseries data
Module Contents¶
Functions¶

Generates a power spectrum from the input timeseries data. Before converting to a power 

Maps the vector x over [0, counts] as 

Inverts the standardizer function. 

Returns the TypeII discrete cosine transform of y, with an orthogonal normalization, where 

Inverts the dct function. 

Returns the discrete Fourier transform of y, with a unitary normalization, where 

Inverts the dft function. 

Performs a LombScargle periodogram (lsp) on the input data, returning powers 

Calculates the Bartlett power spectrum. This involves splitting the data into disjoint 

The omega`th DCT basis function, for a initial time of `starttime and a timestep of timedif, 

The multitest adjusted signficance statistical significance threshold for 

Converts a power to a pvalue, under the assumption that the power is chi2 

The pvalue of the test statistic max(lambda_i) where there are numpowers 

The BenjaminiHockberg quasithreshold for finding the statistically significant powers in 

Returns a set of frequencies to create spectra for, for the input data. These frequencies are 

Calculates the Fourier frequencies associated with a timestep and a total number of times. These frequencies 
Calculates the Fourier frequencies from a set of times. These frequencies are in 1/units, where 


Finds the amplitudes in the data at the specified frequency indices. 

Returns the Hoyer sparsity index of the input vector p. This is defined to be: 

Takes an arbitrary input vector p and maps it to a vector bounded within [0,1]. 

Transforms the input float x as: 

Implements a lowpass filter on the input array, by DCTing the input, mapping all but the lowest 

Implements a moving average on sequence with an averaging width of width. 

Generates a minimal sparsity probability trajectory for the specified power, and a specified number of modes 

Generates a probability trajectory with the specified power that is approximately Gaussian distribution 
Attributes¶
 pygsti.extras.drift.signal._LombScargle¶
 pygsti.extras.drift.signal.spectrum(x, times=None, null_hypothesis=None, counts=1, frequencies='auto', transform='dct', returnfrequencies=True)¶
Generates a power spectrum from the input timeseries data. Before converting to a power spectrum, x is rescaled as
x  > (x  counts * null_hypothesis) / sqrt(counts * null_hypothesis * (1null_hypothesis)),
where the arithmetic is elementwise, and null_hypothesis is a vector in (0,1). If null_hypothesis is None it is set to the mean of x. If that mean is 0 or 1 then the power spectrum returned is (0,1,1,1,…).
 Parameters
x (array) – The timeseries data to convert into a power spectrum
times (array, optional) – The times associated with the data in x. This is not optional for the lsp transform
null_hypothesis (None or array, optional) – Used to normalize the data, and should be the null hypothesis that is being tested for the probability trajectory from which x is drawn. If null_hypothesis is None it is set to the mean of x.
counts (int, optional) – The number of counts per timestep, whereby all values of x are within [0,counts]. In the main usages for drift detection, x is the clickstream for a single measurement outcome – so x contains integers between 0 and the number of measurements at a (perhaps coarsegrained) time. counts is this number of measurements per time.
frequencies ('auto' or array, optional) – The frequencies to generate the power spectrum for. Only relevant for transform=`lsp`.
transform ('dct', 'dft' or 'lsp', optional) – The transform to use to generate power spectrum. ‘dct’ is the TypeII discrete cosine transform with an orthogonal normalization; ‘dft’ is the discrete Fourier transform with a unitary normalization; ‘lsp’ is the floatmeaning LombScargle periodogram with an orthogonallike normalization.
returnfrequencies (bool, optional) – Whether to return the frequencies corrsponding to the powers
 Returns
if returnfrequencies –
 array or None
The frequencies corresponding to the power spectrum. None is returned if the frequencies cannot be ascertained (when times is not specified).
array or None – The amplitudes, that are squared to obtain the powers. None is returned when the transform does not generate amplitudes (this is the case for lsp)
array – The power spectrum
 pygsti.extras.drift.signal.standardizer(x, null_hypothesis=None, counts=1)¶
Maps the vector x over [0, counts] as
x  > (x  counts * null_hypothesis) / sqrt(counts * null_hypothesis * (1null_hypothesis)),
where the arithmetic is elementwise, and null_hypothesis is a vector in (0,1). If null_hypothesis is None it is set to the mean of x. If that mean is 0 or 1 then None is returned.
 pygsti.extras.drift.signal.unstandardizer(z, null_hypothesis, counts=1)¶
Inverts the standardizer function.
 pygsti.extras.drift.signal.dct(x, null_hypothesis=None, counts=1)¶
Returns the TypeII discrete cosine transform of y, with an orthogonal normalization, where
y = (x  counts * null_hypothesis) / sqrt(counts * null_hypothesis * (1null_hypothesis)),
where the arithmetic is elementwise, and null_hypothesis is a vector in (0,1). If null_hypothesis is None it is set to the mean of x. If that mean is 0 or 1 then the vector of all ones, except for the first element which is set to zero, is returned.
 Parameters
x (array) – Data string, on which the normalization and discrete cosine transformation is performed. If counts is not specified, this must be a bit string.
null_hypothesis (array, optional) – If not None, an array to use in the normalization before the dct. If None, it is taken to be an array in which every element is the mean of x.
counts (int, optional) – A factor in the normalization, that should correspond to the countspertimestep (so for full time resolution this is 1).
 Returns
array – The DCT modes described above.
 pygsti.extras.drift.signal.idct(modes, null_hypothesis, counts=1)¶
Inverts the dct function.
 Parameters
modes (array) – The fourier modes to be transformed to the time domain.
null_hypothesis (array) –
 The array that was used in the normalization before the dct. This is
commonly the mean of the timedomain data vector. All elements of this array must be in (0,1).
 countsint, optional
A factor in the normalization, that should correspond to the countspertimestep (so for full time resolution this is 1).
 Returns
array – Inverse of the dct function
 pygsti.extras.drift.signal.dft(x, null_hypothesis=None, counts=1)¶
Returns the discrete Fourier transform of y, with a unitary normalization, where y is an array with elements related to the x array by
y = (x  counts * null_hypothesis) / sqrt(counts * null_hypothesis * (1null_hypothesis)),
where the arithmetic is elementwise, and null_hypothesis is a vector in (0,1). If null_hypothesis is None it is set to the mean of x. If that mean is 0 or 1 then the vector of all ones, except for the first element which is set to zero, is returned.
 Parameters
x (array) – Data string, on which the normalization and discrete cosine transformation is performed. If counts is not specified, this must be a bit string.
null_hypothesis (array, optional) – If not None, an array to use in the normalization before the dct. If None, it is taken to be an array in which every element is the mean of x.
counts (int, optional) – A factor in the normalization, that should correspond to the countspertimestep (so for full time resolution this is 1).
 Returns
array – The DFT modes described above.
 pygsti.extras.drift.signal.idft(modes, null_hypothesis, counts=1)¶
Inverts the dft function.
 Parameters
modes (array) – The fourier modes to be transformed to the time domain.
null_hypothesis (array) –
 The array that was used in the normalization before the dct. This is
commonly the mean of the timedomain data vector. All elements of this array must be in (0,1).
 countsint, optional
A factor in the normalization, that should correspond to the countspertimestep (so for full time resolution this is 1).
 Returns
array – Inverse of the dft function
 pygsti.extras.drift.signal.lsp(x, times, frequencies='auto', null_hypothesis=None, counts=1)¶
Performs a LombScargle periodogram (lsp) on the input data, returning powers and frequencies.
* This function uses astropy, which is not a required dependency for pyGSTi *
 Parameters
todo –
 Returns
todo
 pygsti.extras.drift.signal.bartlett_spectrum(x, numspectra, counts=1, null_hypothesis=None, transform='dct')¶
Calculates the Bartlett power spectrum. This involves splitting the data into disjoint sections of the same length, and generating a power spectrum for each such section, and then averaging all these power spectra.
 Parameters
x (array) – The data to calculate the spectrum for.
numspectra (int) – The number of “chunks” to split the data into, with a spectra calculated for each chunk. Note that if len(x) / num_spectra is not an integer, then not all of the data will be used.
counts (int, optional) – The number of “clicks” per timestep in x, used for standarizing the data.
null_hypothesis (array, optional) – The null hypothesis that we’re looking for violations of. If left as None then this is the nodrift null hypothesis, with the static probability set to the mean of the data.
transform (str, optional) – The transform to use the generate the power spectra.
 Returns
array – The Bartlett power spectrum
 pygsti.extras.drift.signal.dct_basisfunction(omega, times, starttime, timedif)¶
The omega`th DCT basis function, for a initial time of `starttime and a timestep of timedif, evaluated at the times times.
 pygsti.extras.drift.signal.power_significance_threshold(significance, numtests, dof)¶
The multitest adjusted signficance statistical significance threshold for testing numtests test statistics that all have a marginal distribution that is chi2 with dof degrees of freedom.
 pygsti.extras.drift.signal.power_to_pvalue(power, dof)¶
Converts a power to a pvalue, under the assumption that the power is chi2 distribution with dof degrees of freedom.
 pygsti.extras.drift.signal.maxpower_pvalue(maxpower, numpowers, dof)¶
The pvalue of the test statistic max(lambda_i) where there are numpowers lambda_i test statistics, and they are i.i.d. as chi2 with dof degrees of freedom. This approximates the pvalue of the largest power in “clickstream” power spectrum (generated from spectrum), with the DOF given by the number of clicks per times.
 pygsti.extras.drift.signal.power_significance_quasithreshold(significance, numstats, dof, procedure='BenjaminiHochberg')¶
The BenjaminiHockberg quasithreshold for finding the statistically significant powers in a power spectrum.
 pygsti.extras.drift.signal.compute_auto_frequencies(ds, transform='dct')¶
Returns a set of frequencies to create spectra for, for the input data. These frequencies are in units of 1 / unit where unit is the units of the timestamps in the DataSet. What this function is doing has a fundmentally different meaning depending on whether the transform is timestamp aware (here, the LSP) or not (here, the DCT and DFT).
Timestamp aware transforms take the frequencies to calculate powers at as an input, so this chooses these frequencies, which are, explicitly, the frequencies associated with the powers. The task of choosing the frequencies amounts to picking the best set of frequencies at which to interogate the true probability trajectory for components. As there are complex factors involved in this choice that the code has no way of knowing, sometimes it is best to choose them yourself. E.g., if different frequencies are used for different circuits it isn’t possible to (meaningfully) averaging power spectra across circuits, but this might be preferable if the timestep is sufficiently different between different circuits – it depends on your aims.
For timestamp unaware transforms, these are the frequencies that, given that we’re implementing the, e.g., DCT, the generated power spectrum is implicitly with respect to. In the case of data on a fixed timegrid, i.e., equally spaced data, then there is a precise set of frequencies implicit in the transform. Otherwise, these frequencies are explicitly at least slightly ad hoc, and choosing these frequencies amounts to choosing those frequencies that “best” approximate the properties being interogatted with fitting each, e.g., DCT basis function to the (timestampfree) data.
 Parameters
ds (DataSet or MultiDataset) – Contains timeseries data that the “auto” frequencies are calculated for.
transform (str, optional) – The type of transform that the frequencies are being chosen for.
 Returns
frequencies (list) – A list of lists, where each of the consituent lists is a set of frequencies that are the are the derived frequencies for one or more of the dataset rows (i.e., for one or circuits). It is common for this to be a length1 list, containing a single set of frequencies that are the frequencies this function has chosen for every circuit.
pointers (dict) – A dictionary that “points” from the index of a circuit (indexed in the order of ds.keys()) to the index of the list of frequencies for that circuit. No entry in the dictionary for an index should be interpretted as 0. So this can be an empty dictionary (with the frequencies list then necessarily of length 1, containing a single set of frequencies for every circuit).
 pygsti.extras.drift.signal.frequencies_from_timestep(timestep, numtimes)¶
Calculates the Fourier frequencies associated with a timestep and a total number of times. These frequencies are in 1/units, where units is the units of time in times.
 Parameters
timestep (float) – The time difference between each data point.
numtimes (int) – The total number of times.
 Returns
array – The frequencies associated with Fourier analysis on data with these timestamps.
 pygsti.extras.drift.signal.fourier_frequencies_from_times(times)¶
Calculates the Fourier frequencies from a set of times. These frequencies are in 1/units, where units is the units of time in times. Note that if the times are not exactly equally spaced, then the Fourier frequencies are illdefined, and this returns the frequencies based on assuming that the timestep is the mean timestep. This is reasonable for small deviations from equally spaced times, but not otherwise.
 Parameters
times (list) – The times from which to calculate the frequencies
 Returns
array – The frequencies associated with Fourier analysis on data with these timestamps.
 pygsti.extras.drift.signal.amplitudes_at_frequencies(freq_indices, timeseries, times=None, transform='dct')¶
Finds the amplitudes in the data at the specified frequency indices. Todo: better docstring. Currently only works for the DCT.
 pygsti.extras.drift.signal.sparsity(p)¶
Returns the Hoyer sparsity index of the input vector p. This is defined to be:
HoyerIndex = (sqrt(l)  (p_1 / p_2)) / (sqrt(l)  1)
where l is the length of the vector and ._1 and ._2 are the 1norm and 2norm of the vector, resp.
 pygsti.extras.drift.signal.renormalizer(p, method='logistic')¶
Takes an arbitrary input vector p and maps it to a vector bounded within [0,1].
If method = “sharp”, then it maps any value in p below zero to zero and any value in p above one to one.
If method = “logistic” then it ‘squashes’ the vector around it’s mean value using a logistic function. The exact transform for each element x of p is:
x > mean  nu + (2*nu) / (1 + exp(2 * (x  mean) / nu))
where mean is the mean value of p, and nu is min(1mean,mean). This transformation is only sensible when the mean of p is within [0,1].
 Parameters
p (array of floats) – The vector with values to be mapped to [0,1]
method ({'logistic', 'sharp'}) – The method for “squashing” the input vector.
 Returns
numpy.array – The input vector “squashed” using the specified method.
 pygsti.extras.drift.signal.logistic_transform(x, mean)¶
Transforms the input float x as:
x > mean  nu + (2*nu) / (1 + exp(2 * (x  mean) / nu))
where nu is min(1mean,mean). This is a form of logistic transform, and maps x to a value in [0,1].
 pygsti.extras.drift.signal.lowpass_filter(data, max_freq=None)¶
Implements a lowpass filter on the input array, by DCTing the input, mapping all but the lowest max_freq modes to zero, and then inverting the transform.
 Parameters
data (numpy.array,) – The vector to lowpass filter
max_freq (None or int, optional) – The highest frequency to keep. If None then it keeps the minimum of 50 or l/10 frequencies, where l is the length of the data vector
 Returns
numpy.array – The lowpassfiltered data.
 pygsti.extras.drift.signal.moving_average(sequence, width=100)¶
Implements a moving average on sequence with an averaging width of width.
 pygsti.extras.drift.signal.generate_flat_signal(power, nummodes, n, candidatefreqs=None, base=0.5, method='sharp')¶
Generates a minimal sparsity probability trajectory for the specified power, and a specified number of modes containing nonzero power. This is a probability trajectory that has a signal power of power (where, for a probability trajectory p, the signal vector is s = p  mean(p)), and has that power equally spread over nummodes different frequencies. The phases on the amplitudes are randomized.
 Parameters
power (float) – The amount of power in the signal of the generated probability trajectory
nummodes (int) – The number of modes to equally spread that power over.
n (int) – The number of sample times that the probability trajectory is being created for.
candidatefreqs (list, optional) – A list containing a subset of 1,2,…,n1. If not None, then all frequencies are included.
 basefloat in (0,1), optional
The mean of the generated probability trajectory.
 methodstr, optional
The method used to guarantee that the created probability trajectory is a valid probability, i.e., is within [0,1]. Options:
 None. In this case nothing is done to guarantee that the output is a valid probability
trajectory, but the power in the signal is guaranteed to be the input power.
 ‘sharp’ or ‘logistic’. The ‘sharp’ or ‘logistic’ methods with the renormalizer()
function are used to map the generated vector onto [0,1]. When the renormalizer does a nontrivial action on the generated vector, the power in the output will be below the requested power. Moreover, the power will not be distributed only over the requested number of modes (i.e., it impacts the spectrum of the output vector).
 Returns
array – A probability trajectory.
 pygsti.extras.drift.signal.generate_gaussian_signal(power, center, spread, n, base=0.5, method='sharp')¶
Generates a probability trajectory with the specified power that is approximately Gaussian distribution accross frequencies, centered in the frequency specified by center. The probability trajectory has a rescaled signal power of power, where, for a probability trajectory p, the signal vector is s = p  mean(p), and the rescaled signal vector is p  mean(p) / sqrt(mean(p) * (1  mean(p)). The phases on the amplitudes are randomized.
 Parameters
power (float) – The amount of power in the rescaled signal of the generated probability trajectory.
center (int) – The mode on which to center the Gaussian.
spread (int) – The spread of the Gaussian
n (int) – The number of sample times that the probability trajectory is being created for.
base (float in (0,1), optional) – The mean of the generated probability trajectory.
method (str, optional) –
The method used to guarantee that the created probability trajectory is a valid probability, i.e., is within [0,1]. Options:
 None. In this case nothing is done to guarantee that the output is a valid probability
trajectory, but the power in the signal is guaranteed to be the input power.
 ’sharp’ or ‘logistic’. The ‘sharp’ or ‘logistic’ methods with the renormalizer()
function are used to map the generated vector onto [0,1]. When the renormalizer does a nontrivial action on the generated vector, the power in the output will be below the requested power. Moreover, the power will not be distributed only over the requested number of modes (i.e., it impacts the spectrum of the output vector).
 Returns
array – A probability trajectory.