Multivariate neural population analyses (spynal.multi)¶
Multivariate analyses of neural population data – subspaces/manifolds, information geometry, etc.
Overview¶
Functionality for multivariate neural analyses, which typically operates at the level of populations of neural data channels, units, or voxels. Includes analyses for characterizing and comparing multivariate activity patterns that broadly overlap with what different people refer to as “subspace”, “manifold”, or “information/representational geometry” analyses.
Functionality currently includes: - Useful linear algebra utilities not included in numpy/scipy/standard libraries - Computing different measures of distance between population vector representations - Estimating dimensionality of neural population activity - Computing alignment/overlap between subspaces of population activity - Much more functionality is in progress, and will be added…
Most functions perform operations in a “mass-multivariate” manner. This means that rather than embedding function calls in for loops over channels, timepoints, etc., like this:
for condition in conditions:
for timepoint in timepoints:
results[timepoint,condition] = compute_something(data[timepoint,condition])
You can instead execute a single call on ALL the data, labeling the axis corresponding to distinct observations (eg trials, timepoints), and the axis corresponding to distinct multivariate features (eg neural channels/units/voxels) and it will run in parallel (vectorized) across all other data array dimensions (eg corresponding to experimental conditions, timepoints, etc.) like this:
results = compute_something(data, trial_axis, feature_axis)
Function list¶
Linear algebra utilities¶
project_vector : Compute projection of one vector onto another
orthogonalize_vector : Orthogonalize one vector with respect to another
orthogonalize_matrix : Generate orthonomal basis from input matrix of basis vectors
symmetric_orthogonalization : Lowdin symmetric orthogonalization for a matrix of basis vectors
gram_schmidt_orthogonalization : Gram-Schmidt orthogonalization for matrix of basis vectors
covariance_matrix : Compute covariance matrix of data matrix or stack of data matrixes
shrinkage_covariance : Covariance estimator with user-set shrinkage
Multivariate distance-type functions¶
vector_magnitude : Compute magnitude of one or more data vectors using given metric
vector_distance : Compute distance btwn one/more pairs of data vectors using given metric
vector_cosine : Compute cosine of angle between two vectors (or btwn arrays of stacked vectors)
Dimensionality estimation¶
dimensionality : High-level wrapper function for estimating dimensionality of data
pc_expvar_dim : Est. dimensionality as number of PCs to reach threshold explained variance
pc_noise_dim : Est. dimensionality as number of PCs > estimate of noise (cf Machens)
participation_ratio : Continuous measure of dimensionality based on PCA eigenvalue distribution
(IN PROGRESS) shatter_dim : Dim ~ number of implementable binary classifications (cf Rigotti)
Subspace evalutation/manipulation functions¶
subspace_reconstruction_var : Reconstruction-error-based subspace alignment
subspace_projection_var : Cross-projection-based subspace alignment
align_subspaces : Optimally align bases of two subspaces using principal angles method
Subspace overlap/alignment functions¶
subspace_reconstruction_index : Normalized subspace reconstruction error index (cf. Russo 2020)
subspace_projection_index : Normalized subspace cross-projection index (cf. Elsayed 2016)
subspace_error_index : 1 - normalized subspace-to-nullspace cross-projection index (Gokcen 2022)
subspace_principal_angles : Compute principal angles between subspaces (cf. Gallego 2018)
Cross-validation objects/functions¶
OddEvenSplit : scikit Splitter-like cv object implementing traditional odd/even trial splits
BalancedKFold : k-fold cross-validation with trials fully balanced btwn classes in each fold
DummyCrossValidator : scikit Splitter-like object implementing no-cv (same train/test data)
Function reference¶
- project_vector(v1, v2)¶
Compute projection of vector v1 onto vector v2 = proj_v2(v1)
- Parameters:
v1 (array-like, shape=(n_elems,)) – Vector to project onto v2
v2 (array-like, shape=(n_elems,)) – Vector to project v1 onto. Must be same length as v1.
- Returns:
v_proj – Orthogonal projection of v1 onto v2
- Return type:
ndarray, shape=(n_elems,)
- orthogonalize_vector(v1, v2)¶
Orthogonalize vector v1 with respect to vector v2
Subtract projection of v1 onto v2 from v1: oproj_v2(v1) = v1 - proj_v2(v1)
- Parameters:
v1 (array-like, shape=(n_elems,)) – Vector to orthogonalize with respet to v2
v2 (array-like, shape=(n_elems,)) – Vector to orthogonalize v1 with respect to. Must be same length as v1.
- Returns:
v_ortho – v1 orthogonalized with respect to v2
- Return type:
ndarray, shape=(n_elems,)
- orthogonalize_matrix(X, method='symmetric', rankerr=False, tol=None)¶
Orthogonalize a matrix of basis vectors using given method.
A key difference between methods is symmetric does not prioritize any input vectors over others, in that they are affected symmetrically. Other methods prioritize earlier (leftmost) vectors in input, in that later vectors are affected more.
Wrapper around method-specific functions (see those for details).
- Parameters:
X (ndarray, shape=(n_elements,n_vectors)) – Set of vectors to orthogonalize (arranged in a matrix, where the vectors are columns). For a design matrix, the shape would correspond to (n_observations,n_features).
method (str, default: 'symmetric') –
Method to use to orthogonalize basis:
- ’Gram-Schmidt’Iterative method that retains first vector in input, removes the
projection of subsequent vectors onto the first, and so on… Thus, it prioritizes earlier vectors in input. Uses
gram_schmidt_orthogonalization().
- ’QR’Decomposes matrix X = QR into orthonormal matrix Q (returned) and upper-triangular
matrix R (not returned here). Like Gram-Schmidt, it is asymmetric, prioritizing earlier columns. For full-rank matrices, QR and Gram-Schmidt are equivalent (but QR is much faster, and thus generally preferred). Uses
np.linalg.qr().
- ’symmetric’Non-iterative SVD-based method that finds closest (in squared error)
orthonormal basis to input basis, has symmetric effects on all basis vectors. Uses
symmetric_orthogonalization().
rankerr (bool, default: False) – If True, raises an error if X is not full rank. If False, just uses only the first <rank> singular vectors in orthogonalized matrix.
tol (float, default: (eps*largest eigenvalue)) – Threshold minimal eigenvalue below which eigenvalues are regarded as 0. By default, set = max(n_elements,n_vectors)*(largest eigenvalue of X)*(precision for datatype). This is the default tolerance used in
np.linalg.matrix_rank().
- Returns:
X_ortho – Set of orthonormal basis vectors. method = ‘symmetric’ or ‘QR’ returns a matrix same size as input. Gram-Schmidt returns matrix with shape[1] reduced to rank of input matrix (implicitly, the rest of the rank-n_features columns would be all 0’s).
- Return type:
ndarray, shape=(n_elements,n_features) or (n_elements,rank)
- gram_schmidt_orthogonalization(X, rankerr=False, tol=None)¶
Orthogonalize a matrix of basis vectors using Gram-Schmidt orthogonalization.
Takes as input a linearly independent set of vectors (the columns of X) and outputs an orthonormal set.
Note: This method (unlike
symmetric_orthogonalization()) prioritizes earlier vectors (columns from left to right) in input basis X. The first output vector is = first input vector. The rest are computed by removing their projections onto all previous vectors.- Parameters:
X (ndarray, shape=(n_elements,n_vectors)) – Set of vectors to orthogonalize (arranged in a matrix, where the vectors are columns). For a design matrix, the shape would correspond to (n_observations,n_features).
rankerr (bool, default: False) – If True, raises an error if X is not full rank. If False, just uses only the first <rank> singular vectors in orthogonalized matrix.
tol (float, default: (eps*largest eigenvalue)) – Threshold minimal eigenvalue below which eigenvalues are regarded as 0. By default, set = max(n_elements,n_vectors)*(largest eigenvalue of X)*(precision for datatype). This is the default tolerance used in
np.linalg.matrix_rank().
- Returns:
X_ortho – Set of orthonormal basis vectors.
- Return type:
ndarray, shape=(n_elements,rank)
- symmetric_orthogonalization(X, rankerr=False, tol=None)¶
Orthogonalize a matrix of basis vectors using Lowdin symmetric orthogonalization.
Like traditional orthogonalization methods (eg Gram-Schmidt/QR), this takes as input a linearly independent set of vectors (the columns of X) and outputs an orthonormal set.
Unlike traditional methods, this doesn’t prioritize any vectors in set – all are symmetrically affected – and the result is as close as possible to the original data matrix (in terms of Frobenius norm of difference).
The Lowdin orthogonalization of X is simply U*V.T, for U*S*V.T = svd(X).
- Parameters:
X (ndarray, shape=(n_elements,n_vectors)) – Set of vectors to orthogonalize (arranged in a matrix, where the vectors are columns). For a design matrix, the shape would correspond to (n_observations,n_features).
rankerr (bool, default: False) – If True, raises an error if X is not full rank. If False, just uses only the first <rank> singular vectors in orthogonalized matrix.
tol (float, default: (eps*largest eigenvalue)) – Threshold minimal eigenvalue below which eigenvalues are regarded as 0. By default, set = max(n_elements,n_vectors)*(largest eigenvalue of X)*(precision for datatype). This is the default tolerance used in
np.linalg.matrix_rank().
- Returns:
X_ortho – Set of orthonormal basis vectors.
- Return type:
ndarray, shape=(n_elements,n_vectors)
References
https://booksite.elsevier.com/9780444594365/downloads/16755_10030.pdf
- lowdin_orthogonalization(X, rankerr=False, tol=None)¶
Alias of
symmetric_orthogonalization(). See there for details.
- covariance_matrix(data, axis=-2, feature_axis=-1, method='unbiased', **kwargs)¶
Compute feature covariance matrix of data matrix or stack of data matrixes.
Provides unified interface to a number of different methods/functions for computing covariance, including the standard empirical estimate (
np.cov()), Ledoit-Wolf shrinkage (sklearn.covariance.ledoit_wolf()), and Oracle Approximating Shrinkage (sklearn.covariance.oas()).Shrinkage-based estimators (eg Ledoit-Wolf, OAS) can provide better (lower variance, but slightly biased) estimates of covariance when data is ill-conditioned by interpolating between the empirical measured covariance and a simpler form (eg diagonal matrix; see References).
- Parameters:
data (ndarray, shape=(n_obs,n_features) or (...,n_obs,...,n_features,...)) –
Data matrix(es) to compute covariance matrix of.
Can be given as either a single data matrix of shape=(n_obs,n_features) (eg where observations might correspond to trials or time points, and features might correspond to neural channels/units) or a stack of such data matrices (eg for different conditions). In the latter case, the observation and feature axes must be specified in axis and feature_axis, respectively, and the covariance will be computed separately for each data matrix, and returned in a single array.
axis (int, default: -2 (2nd to last axis)) – Axis of data corresponding to distinct observations (eg trials, timepoints, conditions) to compute covariance across.
feature_axis (int, default: -2 (2nd to last axis)) – Axis of data corresponding to distinct features (eg neural channels/units) to compute covariance between.
method (string or callable, default: 'unbiased') –
Method to use to compute covariance:
’unbiased’ : Empirical covariance unbiased by n, using
np.cov()with bias=False’MLE’ : Maximum Likelihood Est of cov (biased by n), using
np.cov()with bias=True- ’shrinkage’Performs “shrinkage” (regularization) of covariance toward
simpler-structured version, with user-set level of shrinkage/regularization, using
shrinkage_covariance()
- ’ledoit_wolf’Ledoit-Wolf method, which finds optimal shrinkage that minimizes
Mean Squared Error btwn estimated and actual covariance, using
sklearn.covariance.ledoit_wolf()
- ’OAS’Oracle Approximating Shrinkage. Slightly improved regularization method.
Uses
sklearn.covariance.oas().
**kwargs – Any additional arguments passed as-is to covariance computing function
- Returns:
cov – Feature covariance matrix(es) of data.
For single data matrix, returns single cov matrix of shape (n_features,n_features).
For stack of data matrices, returns cov of all data matrices in a single array with same shape as input data, but with observation axis now having length=n_features.
- Return type:
ndarray, shape=(n_features,n_features) or (…,n_features,…,n_features,…)
References
- shrinkage_covariance(data, target='diagonal', param=0.1)¶
Estimate feature covariance matrix of data matrix or stack of data matrixes using shrinkage estimator with user-set shrinkage value.
Shrinkage-based estimators can provide better estimates of covariance when data is ill-conditioned by interpolating between the empirical measured covariance and a simpler form (eg diagonal matrix). To set optimal shrinkage given your data matrix instead of hand-set shrinkage, use
sklearn.covariance.ledoit_wolf()orsklearn.covariance.oas().Note: This function is not set up to compute covariance in a “mass-multivariate” fashion. For that, use the wrapper function
covariance_matrix()with method = ‘shrinkage.- Parameters:
data (ndarray, shape=(n_obs,n_features)) – Data matrix to compute feature covariance matrix of.
target ({'diagonal','scalar','identity'}, default: 'diagonal') –
Target simpler form of covariance matrix to interpolate with empirical covariance estimate. Options:
’diagonal’ : Matrix w/ sample variances on diagonal, zeros everywhere else
’scalar’ : Identity matrix scaled by average sample variance
’identity’ : Identity matrix (1’s on diagonal, 0’s everywhere else)
param (float, default: 0.1) – Shrinkage parameter determining how much of target covariance is combined with empirical covariance estimate. Range is 0-1 (0 is no shrinking, returns empirical covariance; 1 is maximal shrinkage, returns covariance target).
- Returns:
cov – Feature covariance matrix of data, with given shrinkage
- Return type:
ndarray, shape=(n_features,n_features)
- vector_magnitude(data, axis=-1, method='euclidean', data_test=None, cov_inv=None, keepdims=True)¶
Compute vector magnitude (length/distance from origin) of one or more vectors in data array, using given distance metric.
Can optionally cross-validated (or cross-condition) magnitude metric using both data and data_test. Unlike uncrossed metrics, these are unbiased (have expected value of 0).
- Parameters:
data (ndarray, shape=(n_features,) or (...,n_features,...)) – Data array to compute magnitude of. Should be single data vector (eg activity across a set of neural channels/units), or a set of stacked vectors for multiple data series (eg timepoints, conditions, etc.). Shape is arbitrary, but axis must correspond to array dimension in which vector elements (features) are laid out along (used to compute vector magnitude), while all other dimensions are treated as separate vectors and analyzed independently.
axis (int, default:-1 (last dimension of data)) – Axis of data array to perform analysis on, in which each vector’s elements (features) are laid out along. Usually corresponds to distinct neural data channels/units. Analysis is performed independently along all other array axes.
method (str, default: 'euclidean') –
Distance metric to compute magnitudes under. For cross-validated metrics (if a value for data_test is given), only squared metrics {‘sqeuclidean’,’sqmahalanobis’} allowed (otherwise you’d have complex results). Options:
’euclidean’ : Euclidean distance (aka L2-norm)
’sqeuclidean’ : Squared Euclidean distance
- ’mahalanobisMahalanobis distance from origin – distance relative to covariance
(like a multivariate z-score). Must input cov_inv.
’sqmahalanobis’ : Squared Mahalanobis distance. Must input cov_inv.
data_test (ndarray, shape=(n_features,) or(...,n_features,...)) – To compute cross-validated (or cross-condition) magnitude btwn distinct “training” and “testing” data, input a distinct data array here. Shape must be same as data.
cov_inv (ndarray, shape=(...,n_features,n_features,...)) – Inverse covariance matrix(es) corresponding to vectors in data. Required for method = ‘[sq]mahalanobis’, but ignored otherwise. Shape must be same as data, but with additional axis of length n_features inserted immediately after axis (together, these two axes correspond to the covariance matrixes).
keepdims (bool, default: True) – If True, axis axis is reduced to length-one axes, but retained in output. If False, axis axis is removed from output.
- Returns:
magnitude – Magnitude of each vector in data (or crossed magnitude of data & data_test).
For a single vector, returned as a scalar float. For set of multiple vectors, returned as array same size as data, but with axis reduced to length 1 (if keepdims is True) or removed (if keepdims is False).
- Return type:
float or ndarray, shape=(…,[1,]…)
- vector_distance(data, data2, axis=-1, method='euclidean', data_test=None, data2_test=None, cov_inv=None, keepdims=True)¶
Compute distance between of one or more pairs of vectors in data and data2 arrays, using given distance metric.
Can optionally cross-validated (or cross-condition) distance metric using both data and data_test. Unlike uncrossed metrics, these are unbiased (have expected value of 0).
- Parameters:
data (ndarray, shape=(n_features,) or (...,n_features,...)) –
Paired data arrays to compute distance between. Should each be a single data vector (eg activity across a set of neural channels/units), or a set of stacked vectors for multiple data series (eg timepoints, conditions, etc.).
Shape is arbitrary, but axis must correspond to array dimension in which vector elements (features) are laid out along (used to compute distance), while all other dimensions are treated as separate vectors and analyzed independently.
data2 (ndarray, shape=(n_features,) or (...,n_features,...)) –
Paired data arrays to compute distance between. Should each be a single data vector (eg activity across a set of neural channels/units), or a set of stacked vectors for multiple data series (eg timepoints, conditions, etc.).
Shape is arbitrary, but axis must correspond to array dimension in which vector elements (features) are laid out along (used to compute distance), while all other dimensions are treated as separate vectors and analyzed independently.
axis (int, default:-1 (last dimension of data)) – Axis of data array to perform analysis on, in which each vector’s elements (features) are laid out along. Usually corresponds to distinct neural data channels/units. Analysis is performed independently along all other array axes.
method (str, default: 'euclidean') –
Distance metric to compute distances under. For cross-validated metrics (if a value for data_test is given), only squared metrics {‘sqeuclidean’,’sqmahalanobis’} allowed (otherwise you’d have complex results). Options:
’euclidean’ : Euclidean distance (aka L2-norm)
’sqeuclidean’ : Squared Euclidean distance
- ’mahalanobisMahalanobis distance from origin – distance relative to covariance
(like a multivariate z-score). Must input cov_inv.
’sqmahalanobis’ : Squared Mahalanobis distance. Must input cov_inv.
’cosine’ : Cosine of angle between feaure vectors
’correlation’ : Pearson correlation between feature vectors
’spearman’ : Spearman rank correlation between feature vectors
data_test (ndarray, shape=(n_features,) or (...,n_features,...)) – To compute cross-validated (or cross-condition) magnitude btwn distinct “training” and “testing” data, input a distinct data array here. Shape must be same as data.
data2_test (ndarray, shape=(n_features,) or (...,n_features,...)) – To compute cross-validated (or cross-condition) magnitude btwn distinct “training” and “testing” data, input a distinct data array here. Shape must be same as data.
cov_inv (ndarray, shape=(...,n_features,n_features,...)) – Inverse covariance matrix(es) corresponding to vectors in data. Required for method = ‘[sq]mahalanobis’, but ignored otherwise. Shape must be same as data, but with additional axis of length n_features inserted immediately after axis (together, these two axes correspond to the covariance matrixes).
keepdims (bool, default: True) – If True, axis axis is reduced to length-one axes, but retained in output. If False, axis axis is removed from output.
- Returns:
distance – Distance between each vector pair in data vs data2 (or crossed distance between data/data_test vs data_test/data2_test).
For a single pair of vectors, returned as a scalar float. For set of multiple vector pairs, returned as array same size as data, but with axis reduced to length 1 (if keepdims is True) or removed (if keepdims is False).
- Return type:
float or ndarray, shape=(…,[1,]…)
- vector_cosine(x1, x2, axis=0, keepdims=True)¶
Compute cosine of angle between two vectors. Can also compute mass-multivariate cosine between two paired sets of stacked vectors.
To compute angle itself do: np.arccos(vector_cosine(x1,x2)) (in radians), or in degrees, np.rad2deg(np.arccos(vector_cosine(x1,x2)))
- Parameters:
x1 (array-like, shape=(n_elems,) or (...,n_elems,...)) – Two vectors to compute cosine of angle between, or two array of stacked vectors Can be any arbitrary length, but must both be same along axis.
x2 (array-like, shape=(n_elems,) or (...,n_elems,...)) – Two vectors to compute cosine of angle between, or two array of stacked vectors Can be any arbitrary length, but must both be same along axis.
axis (int, default: 0 (1st axis)) – For stacked-vector arrays, this gives the array axis to treat as separate vectors, and compute cosine along. Not used for single vector inputs.
keepdims (bool, default: True) – For stacked-vector arrays, if True, the length-1 reduced vector axis is retained in the output; if False, axis is removed.
- Returns:
cosine – Cosine of angle between vectors. Ranges from [-1,+1]. Returned as a single float for vector inputs. For stacked-vector arrays, returned as an array with same shape as inputs, but with axis reduced to length 1 (keepdims=True) or removed (keepdims=False).
- Return type:
float or ndarray, shape=(…[,1],…)
- dimensionality(data, labels=None, method='pc_noise', **kwargs)¶
High-level interface for computing dimensionality of data with respect to a given set of task/behavioral conditions/variables
NOTE: Unlike most other Spynal functions, this is currently only set up to accept and compute dimensionality for a single data matrix (ie, it doesn’t yet compute in a “mass-multivariate” fashion – TODO!!!)
- Parameters:
data (ndarray, shape=(n_obs,n_features)) – Data to estimate dimensionality of. “Observations” usually correspond to trials (and/or timepoints), and “features” usually corresponds to different neural channels/units.
labels (array-like, shape=(n_obs,), default: None) –
Condition labels for each observation (trial) in data. Must be discrete-valued.
Some dimensionality methods require a labels argument, and measure a supervised “coding dimensionality” (ie of the data condition means). For others, this argument is optional, and if it is omitted, they measured an unsupervised overall data dimensionality (ie of the raw data, including noise). See method and specific functions for details.
method (str, default: 'pc_noise') –
Method to use to compute dimensionality. Estimate dimensionality as:
- ’pc_expvar’Number of PCA eigenvalues > given threshold explained variance,
using
pc_expvar_dim()
- ’pc_noise’The number of data “signal” PCA eigenvalues > estimated noise
eigenvalues, cf. Machens 2010, using
pc_noise_dim()
- ’participation_ratio’Continuous measure of dimensionality based on distribution
of PCA eigenvalues, cf. Gao 2017, using
participation_ratio()
- ’shatter’The number of implementable binary classifications of data
(aka “shatter dimension”) cf. Rigotti 2013, using
shatter_dim()NOTE: This functionality is currently in progress, and not working just yet (TODO)
**kwargs – Any other keyword args passed directly to specific function for computing dimensionality
- Returns:
dimensionality – Estimate of data dimensionality
- Return type:
float
References
Machens, Romo, Brody (2010) J Neurosci https://doi.org/10.1523/JNEUROSCI.3276-09.2010
Rigotti…Miller,Fusi (2013) Nature https://doi.org/10.1038/nature12160
Gao…Shenoy, Ganguli (2017) https://doi.org/10.1101/214262
- participation_ratio(data, labels=None, **kwargs)¶
Estimate dimensionality of data using the “participation ratio” of the eigenspectrum of the covariance matrix, from Gao 2017. This is the ratio of the square of the sum of all eigenvalues, divided by the sum of the squared eigenvalues:
dim = sum(lambda)**2 / sum(lambda**2)
where lambda’s are the eigenvalues of the covariance matrix, and the sum is over all of them
This results in a measure of dimensionality that ranges continuously from 1 (all variance concentrated in a single eigenvalue, ie 1 dimension) to the max number of dimensions = min(n_observations,n_features) (all eigenvalues equal).
This can be used to estimate either the unsupervised overall data dimensionality (ie of the raw data, including noise; if labels is None) or the supervised “coding dimensionality” (of the condition means; if values given for labels).
- Parameters:
data (ndarray, shape=(n_obs,n_features)) – Data to estimate dimensionality of. “Observations” usually correspond to trials (and/or timepoints), and “features” usually corresponds to different neural channels/units.
(optional) (labels) –
Condition labels for each observation (trial) in data. Must be discrete-valued.
If a value us input for labels, function will compute mean of data within each condition (unique label value) before computing dimensionality, and thus returning an estimate of the “coding dimensionality”.
Leave labels = None to compute the dimensionality of the raw data (or if input data already reflects condition means).
**kwargs – Any additional keywords arguments passed to
covariance_matrix(), which is used to compute data covariance (and defaults tonp.cov())
- Returns:
dimensionality – Estimate of data dimensionality, ranging continuously from 1 - min(n_obs,n_features)
- Return type:
float
References
Gao…Shenoy, Ganguli (2017) https://doi.org/10.1101/214262
- pc_expvar_dim(data, labels=None, cutoff=0.95, **kwargs)¶
Estimate dimensionality of data as the first k principal components to retain a given cumulative level of explained variance. This results in a discrete measure of dimensionality ranging from 1 to the max number of data dimensions = min(n_observations,n_features).
This can be used to estimate either the unsupervised overall data dimensionality (ie of the raw data, including noise; if labels is None) or the supervised “coding dimensionality” (of the condition means; if values given for labels).
- Parameters:
data (ndarray, shape=(n_obs,n_features)) – Data to estimate dimensionality of. “Observations” usually correspond to trials (and/or timepoints), and “features” usually corresponds to different neural channels/units.
(optional) (labels) –
Condition labels for each observation (trial) in data. Must be discrete-valued.
If a value us input for labels, function will compute mean of data within each condition (unique label value) before computing dimensionality, and thus returning an estimate of the “coding dimensionality”.
Leave labels = None to compute the dimensionality of the raw data (or if input data already reflects condition means).
cutoff (float, default: 0.95) – Threshold proportion of explained variance to use to compute dimensionality. Dimensionality is the minimum number of PCs needed to explain >= cutoff variance.
**kwargs – Any additional keywords arguments passed to
covariance_matrix(), which is used to compute data covariance (and defaults tonp.cov())
- Returns:
dimensionality – Estimate of data dimensionality, ranging discretely from 1 - min(n_obs,n_features)
- Return type:
float
References
Cunningham & Yu (2014) Nat Neurosci https://doi.org/10.1038/nn.3776
- pc_noise_dim(data, labels, noise_method='sd', cutoff=None, which_eigs='all', n_noise_resmps=100)¶
Estimate dimensionality of data as the number of eigenvalues (as from a Principal Components Analysis) in full data which can be attributed to data “signal”, as opposed to “noise”, which is estimated by resampling random pairs of within-condition trials, cf. Machens 2010.
This method is only applicable to the supervised “coding dimensionality”; it is not defined for the full dimensionality of raw data. It returns a discrete measure of dimensionality ranging from 1 to the max number of data dimensions = min(n_observations,n_features)-1.
- Parameters:
data (ndarray, shape=(n_obs,n_features)) – Data to estimate dimensionality of. “Observations” usually correspond to trials (and/or timepoints), and “features” usually corresponds to different neural channels/units.
labels (array-like, shape=(n_obs,)) – Condition labels for each trial in data. Must be discrete-valued, and (unlike some dimensionality methods) must be input
noise_method ({'expvar','quantile','sd'}, default: 'sd') –
Determines specific method to use to compare signal to noise eigenvalues, in order to estimate dimensionality:
- ’expvar’Find first signal eigenvalue where > cutoff proportion of total signal
variance (sum of all signal eigenvalues) is explained. This is the original version used in Machens 2010. Our testing with synthetic data suggests it underestimates when dimensionality is high relative to number of conditions.
- ’quantile’Find first signal eigenvalue that drops below cutoff quantile of
resampled noise eigenvalue distribution. For this and ‘sd’, comparisons can be done against the noise distribution of only the first eigenvalue or each signal eigenvalue can be compared against the corresponding-order noise eigenvalue, depending on value of which_eigs. This version (with which_eigs = ‘first) was used in Supplemental results of Rigotti 2013.
- ’sd’Find first signal eigenvalue that drops below mean + cutoff std dev’s
of resampled noise eigenvalue distribution. In our simulations, ‘quantile’ and ‘sd’ are less biased than the ‘expvar’ method, and ‘sd’/which_eigs = ‘all’ is slightly less biased and lower variance (which is why it’s the default).
cutoff (float, default: (depends on noise_method)) –
Criterion value to use for determining dimensionality. Interpretation and default value depends on value of noise_method:
- ’expvar’Proportion of signal variance that eigenvalues must cumulatively explain
(range: 0-1, default: 0.95)
- ’quantile’Quantile of noise eigenvalue distribution that signal must drop below
(range: 0-1, default: 0.95)
- ’sd’Multiple of SD of noise eigenvalue dist’n that signal must drop below
(range: 0-inf, default: 2)
which_eigs ({'first','all'}, default: 'all') –
Which noise eigenvalue(s) to compare observed signal eigenvalues to, to establish noise floor for dimensionality computation:
’first’ : Compare each signal eigenvalue to distn of first (largest) noise eigenvalue
’all’ : Compare each signal eigenvalue to distn of corresponding-rank noise eigenvalue.
Note that this parameter is only used for noise_method = ‘quantile’ or ‘sd’; for ‘expvar’, all eigenvalues are always used.
n_noise_resmps (int, default: 1000) – Number of times to randomly resample noise to generate distribution of noise eigenvalues
- Returns:
dimensionality – Estimate of data dimensionality, ranging discretely from 1 - min(n_obs,n_features)
- Return type:
int
References
Machens, Romo, Brody (2010) J Neurosci https://doi.org/10.1523/JNEUROSCI.3276-09.2010
Rigotti…Miller,Fusi (2013) Nature https://doi.org/10.1038/nature12160
Notes
This function is based on Matlab code graciously provided by Mattia Rigotti. Go cite his paper^
- subspace_reconstruction_var(data, basis, feature_axis=-1, sum_axis=None, expansion_basis=None, keepdims=True)¶
Compute the proportion of total variance in data captured by a subspace (given by its basis/eigenvector matrix), quantified as the sum of squared error of the subspace basis’s reconstruction of the data, normalized by the sum of squares of the data itself:
expvar = 1 - ||X - X * W * W.T||^2 / ||X||^2
where X is the original data, W is the subspace basis, and ||.|| is the L2 norm
Note that the basis could be estimated from the data itself, or from a different set of data (eg different timepoint, experimental condition), in which case this would reflect the cross-variance captured.
This is used in the “subspace overlap” index of Russo 2020, which computes the cross-variance captured in one “reference” set of data by a basis fit to another “comparison set of data, normalized by the “self-variance” captured in the reference data by a basis fit to itself. This index is implemented in
subspace_reconstruction_index().Function allows mass-multivariate computation of captured variance for multiple data vectors (eg different trials/conditions, time points, etc.), but only for a single subspace basis.
- Parameters:
data (ndarray, shape=(n_features,) or (...,n_features,...)) – Data to compute captured variance of. Can be single data vector (typically corresponding to a population of neural channels/units) or stack of such vectors in an array (eg for different trials, timepoints, etc.), with the array axis corresponding to different vector features (eg channels) given in feature_axis. In this case, variance captured is computed separately for each data vector and returned in a similiarly-shaped array.
basis (ndarray, shape=(n_features,n_components)) – Basis matrix for subspace, which is used to reduce data dimensionality down to n_components. Column vectors correspond to sequence of orthonormal basis vectors for subspace, eg the top-k eigenvectors from a PCA; features along each column correspond to same feature in data (eg neural channels/units). For simplicity, only a single basis is allowed as input.
feature_axis (int, default: -1 (last axis)) – Axis of data corresponding to different features, to compute captured variance along. Typically corresponds to different neural channels/units.
sum_axis (int, default: None (no other axis)) –
Optional additional axis to compute captured variance along, in addition to feature_axis (ie using Frobenius matrix norm along two dims instead of standard L2/Euclidean norm along only feature_axis).
For index used in Russo 2020, this would correspond to the time axis of the data array; for trial-based data, this might correspond to its trial axis.
expansion_basis (ndarray, shape=(n_components,n_features), default: basis.T) –
Basis matrix for re-expansion of dimensionality-reduced data back to original data dimensionality. Only needed for dimensionality reduction methods (eg dPCA) that estimate distinct bases for dimensionality reduction and re-expansion (the “decoder” and “encoder” axes, respectively, of Kobak et al 2016).
Otherwise, this defaults to the transpose of the subspace basis (as is the case for PCA, LDA, and most standard dimensionality reduction methods).
keepdims (bool, default: False) – If True, retains reduced feature_axis (and sum_axis) as length-one axis(s) in output. If False, removes reduced feature_axis (and sum_axis) from outputs.
- Returns:
expvar – Proportion of variance in data captured by subspace, based on its reconstruction of the original data. Ranges from 0 (no variance captured) to 1 (all variance captured).
Returns a float for single vector input. For stacked vectors, returns array the same shape as data but with feature_axis (and sum_axis if set) reduced to length 1 (if keepdims is True) or removed (if keepdims is False).
- Return type:
float or ndarray, shape=(…,[1,]…)
References
Gallego … Miller (2018) Nature Comms https://doi.org/10.1038/s41467-018-06560-z
Russo … M.Churchland (2020) Neuron https://doi.org/10.1016/j.neuron.2020.05.020
Kobak … Machens (2016) eLife https://doi.org/10.7554/eLife.10989 (dPCA)
- subspace_projection_var(cov, basis, feature_axis=(-2, -1), keep_components=False, normalization='pev', keepdims=True)¶
Compute the proportion of total variance in data (given by its covariance matrix) captured by a subspace (given by its basis/eigenvector matrix), by computing the (trace of) the projection of the covariance matrix onto the basis:
expvar = trace(W.T * C * W) / norm
where C is the data covariance matrix, W is the subspace basis, and norm is the normalizaiton constant (which is determined by normalization); if keep_components is True, trace() is replaced by diag()
Note that the basis could be estimated from the data itself, or from a different set of data (eg different timepoint, experimental condition), in which case this would reflect the cross-variance captured.
Output can be returned as raw captured data variance (in units of data^2), normalized by the number of features (ie channels/neurons, in units of data^2 per feature, cf. Murray 2017), or normalized by total data variance (so reflects proportion of variance captured, range 0-1), depending on value set for normalization. Can also compute variance captured by each basis vector (rather than total for full basis) using keep_components = True.
This measure is taken from Murray 2017, who used the ‘feature’ normalization version (expressing captured variance per neuron). It’s also used to compute the numerator (and in our formulation, also the denominator) of the “subspace alignment” index of Elsayed 2016 (cf.
subspace_projection_index()).Function allows mass-multivariate computation of captured variance for multiple data vectors (eg different trials/conditions, time points, etc.), but only for a single subspace basis.
- Parameters:
cov (ndarray, shape=(n_features,n_features) or (...,n_features,n_features,...)) –
Cross-feature covariance matrix of data to compute captured variance of.
Can be single covariance matrix (typically corresponding to covariance of a population of neural channels/units) or stack of such matrixes (eg for different trials, timepoints, etc.), with the pair of array axes corresponding to different vector features (eg channels/neurons) given in feature_axis.
basis (ndarray, shape=(n_features,n_components)) – Basis matrix for subspace, which is used to reduce data dimensionality down to n_components. Column vectors correspond to sequence of orthonormal basis vectors for subspace, eg the top-k eigenvectors from a PCA; features along each column correspond to same feature in data (eg neural channels/units). For simplicity, only a single basis is allowed as input.
feature_axis (array-like, shape=(2,), default: (-2,-1) (last two array axes)) – Axes of cov corresponding to different data vector features. Must have 2 values. This will typically correspond to covariance between different neural channels/neurons.
keep_components (bool, default: False) – If True, returns variance captured by each basis component vector (eg PC) separately. If False, return total (summed) variance captured by all basis vectors.
normalization ({'none','feature','pev'}, default: 'pev') –
How to normalize returned captured variance:
’none’ : Return raw, unormalized captured variance (eg in (spikes/s)^2)
- ’feature’Normalize by number of vector features (ie channels/neurons),
so reflects variance per channel/neuron (eg in (spikes/s)^2 per neuron)
- ’pev’Normalize by total variance, so reflects proportion of total data
variance captured (range 0-1)
keepdims (bool, default: True) – If True, retains reduced feature_axis as length-one axes in output. If False, removes reduced feature_axis from outputs.
- Returns:
expvar – Data variance captured by given basis, either raw variance or proportion, depending on value set for normalization. If keep_components is True, returns variance captured by each basis component vector; if False, returns variance summed across all components.
- Return type:
float or ndarray, shape=(…,[1,1,],…) or (…,[n_components,n_components,],…)
If keep_components is False, returns a float for single matrix input. For stacked matrices, returns array the same shape as data but with feature_axis reduced to length 1 (if keepdims is True) or removed (if keepdims is False). If keep_components is True, returns array the same shape as data.
References
Murray … Wang (2017) PNAS https://doi.org/10.1073/pnas.1619449114
Elsayed … Cunningham (2016) Nature Comms https://doi.org/10.1038/ncomms13239
Yoo & Hayden (2020) Neuron https://doi.org/10.1016/j.neuron.2019.11.013
- todo Should we retain covariance,basis args here or make it data,basis and compute cov internally?
Would align args with reconstruction func and simplify the 2-axis feature_axis arg here
- align_subspaces(basis1, basis2, feature_axis=-2, component_axis=-1, return_cosines=False, keepdims=True, _compute_uv=True)¶
Optimally align bases of two subspaces using Bjorck & Golub principal angles method.
Computes new bases that optimally align subspaces, essentially by computing the canonical correlation between them (SVD of their matrix product).
Also optionally returns the cosines of the angles between the aligned subspaces, in ranked order (decreasing cosines/increasing angles).
Note that if you are only interested in the principal angles/cosines themselves, you should use
subspace_principal_angles()instead.- Parameters:
basis1 (ndarray, shape=(n_features,n_components1) or (...,n_features,...,n_components1,...)) –
Basis matrix for first subspace to align. Column vectors (along component_axis) correspond to sequence of orthonormal basis vectors for subspace, eg the top-k eigenvectors from a PCA; features along each column typically correspond to different neural channels/units.
Can be single basis matrix or stack of such matrices in an array (eg for different experimental conditions, timepoints, etc.), with the array axis corresponding to different vector features (eg channels) given in feature_axis.
- basis2ndarray, shape=(n_features,n_components2) or (…,n_features,…,n_components2,…)
Basis for second subspace to align. Often, this is derived from some other condition (eg timepoint, exp condition) that we are aligning/comparing the basis1 condition to.
May have different number of components (basis vectors), but must otherwise have same shape as basis1.
- feature_axisint, default: -2 (second-to-last axis)
Axis of basis1 and basis2 corresponding to different features. This will typically correspond to different neural channels/units.
- component_axisint, default: -1 (last axis)
Axis of basis1 and basis2 corresponding to different basis/eigenvectors of each basis. This corresponds to different components derived from a dimensionality reduction analysis.
- return_cosinesbool, default: False
If True, returns third output = cosines of principal angles btwn subspace. If False, just returns aligned bases.
- keepdimsbool, default: True
If True, retains reduced feature_axis as length-one axes in output. If False, removes reduced feature_axis from outputs.
- Returns:
basis1 (ndarray, shape=(n_features,n_components1) or (…,n_features,…,n_components1,…)) – Optimally aligned basis for first subspace. Same shape as input.
basis2 (ndarray, shape=(n_features,n_components2) or (…,n_features,…,n_components2,…)) – Optimally aligned basis for second subspace. Same shape as input.
cosines (ndarray, shape=(k,) or (…,[1,]…,k,…), optional) – Series of angle cosines between each pair of basis vectors in compared subspaces. Length of returned angles is the minimum of the number of vectors/components in basis1 and basis2: k = min(n_components1,n_components2)
Returns a 1D array for single matrix inputs. For stacked matrices, returns array the same shape as basis1/2 but with length of component_axis = min(n_components1,n_components2) and feature_axis reduced to length 1 (if keepdims is True) or removed (if keepdims is False).
References
Bjorck & Golub (1973) Math Comp http://dx.doi.org/10.2307/2005662 (original method)
- subspace_reconstruction_index(data_comp, basis_ref, basis_comp, feature_axis=-1, sum_axis=None, expansion_basis_comp=None, expansion_basis_ref=None, keepdims=True)¶
Compute index of similarity betweeen two subspaces estimated from distinct “reference” and “comparison” sets of data, quantified as the proportion of variance in the comparison data captured by cross-projecting it onto the reference subspace, normalized by the proportion of comparison data variance captured by projecting it onto a subspace estimated from itself:
index = (1 - ||X_comp - X_comp * W_ref * W_ref.T||^2 / ||X_comp||^2) / (1 - ||X_comp - X_comp * W_comp * W_comp.T||^2 / ||X_comp||^2)
where X’s are the data, W’s are the subspace bases, and ||.|| is the L2 norm
This index could be used to compare subspaces between, for example, different timepoints or experimental condition. Note that the reference and comparison subspaces need not have the dimensionality (number of components).
The index ranges from 0 (no alignment) to 1 (perfect alignment). Note that inputs are treated asymmetrically. Roughly, “reference” and “comparison” here correspond to “training” and “testing” conditions for decoding (classification, regression) analysis. In cross-temporal decoding terms, this index would correspond to normalizing the cross-temporal matrix by its columns (normalizing each test time by its own decoder fit).
Function allows mass-multivariate computation of index for multiple data vectors (eg different trials/conditions, time points, etc.), but only for a single subspace basis.
This is the “subspace overlap” index of Russo 2020. Note: In Russo’s formulation, sums of squares are Frobenius (matrix) norms across features (channels/neurons) AND time. By default, this function only computes L2 (vector) norms across features (neurons). To match Russo, set sum_axis = <time axis> (or to sum across trials and neurons, set sum_axis = <trial axis>). Also, according Russo 2020’s formula (p. e4, “Methods: Quantification and Statistical analysis : Subspace overlap”), they use the unsquared norm; we instead use the squared norm, to keep it analogous to the univariate formulas for Sums of Squares.
- Parameters:
data_comp (ndarray, shape=(n_features,) or (...,n_features,...)) –
“Comparison” condition data array, used to compute variance captured by cross-projection onto reference basis basis_ref derived from a different condition.
Can be single data vector (typically corresponding to a population of neural channels/units) or stack of such vectors (eg for different trials, timepoints, etc.), with the array axis corresponding to different vector features given in feature_axis.
basis_ref (ndarray, shape=(n_features,n_components_ref)) – Basis matrix for subspace estimated from “reference” condition, which is used to reduce data dimensionality down to n_component_ref. Column vectors correspond to sequence of orthonormal basis vectors for subspace, eg the top-k eigenvectors from a PCA; features along each column correspond to same feature in data (eg neural channels/units). For simplicity, only a single basis is allowed as input.
basis_comp (ndarray, shape=(n_features,n_components_comp)) – Basis matrix for subspace estimated from “comparison” condition subspace. Shape and interpretation otherwise same as basis_ref; doesn’t need to have same number of components (dimensionality) as basis_ref.
feature_axis (int, default: -1 (last axis)) – Axis of data corresponding to different features. This will typically correspond to different neural channels/units.
sum_axis (int, default: None (no other axis)) – Optional additional axis (in addition to feature_axis) to compute Sums of Squares, and thus explained variance along (ie using matrix/Frobenius norm along two dims instead of standard L2/Euclidean norm along feature_axis). In Russo 2020 formulation, this would correspond to the time axis; for trial-based data, this might correspond to the trial axis.
expansion_basis_comp (ndarray, shape=(n_components_comp,n_features), default: basis_comp.T) –
Basis matrix for re-expansion of dimensionality-reduced data back to original data dimensionality, for comparison subspace. Only needed for dimensionality reduction methods (eg dPCA) that estimate distinct bases for dimensionality reduction and re-expansion (the “decoder” and “encoder” axes, respectively, of Kobak et al 2016).
Otherwise, this defaults to the transpose of the subspace basis (as is the case for PCA, LDA, and most standard dimensionality reduction methods).
expansion_basis_ref (ndarray, shape=(n_components_ref,n_features), default: basis_ref.T) – As above, but for reference subspace.
keepdims (bool, default: False) – If True, retains reduced feature_axis as length-one axis in output. If False, removes reduced feature_axis from outputs.
- Returns:
overlap – Subspace overlap index, measuring overlap btwn reference and comparison subspaces.
Returns a float for single vector input. For stacked vectors, returns array the same shape as data but with feature_axis (and sum_axis if set) reduced to length 1 (if keepdims is True) or removed (if keepdims is False).
- Return type:
ndarray, shape=float or (…,[1,]…)
References
Russo … M.Churchland (2020) Neuron https://doi.org/10.1016/j.neuron.2020.05.020
- subspace_projection_index(cov_comp, basis_ref, basis_comp, feature_axis=(-2, -1), keep_components=False, keepdims=True, **kwargs)¶
Compute index of similarity betweeen two subspaces estimated from distinct “reference” and “comparison” sets of data, quantified as the magnitude of the cross-projection of the comparison data onto the reference subspace, normalized by the self-projection of the comparison subspace onto its own subspace:
index = trace(W_ref.T * C_comp * W_ref) / trace(W_comp.T * C_comp * W_comp)
where C’s are the data covariance matrices and W’s are the subspace bases
This index could be used to compare subspaces between, for example, different timepoints or experimental condition. Note that the reference and comparison subspaces need not have the dimensionality (number of components).
The index ranges from 0 (no alignment) to 1 (perfect alignment). Note that inputs are treated asymmetrically. Roughly, “reference” and “comparison” here correspond to “training” and “testing” conditions for decoding (classification, regression) analysis. In cross-temporal decoding terms, this index would correspond to normalizing the cross-temporal matrix by its columns (normalizing each test time by its own decoder fit).
Function allows mass-multivariate computation of index for multiple data vectors (eg different trials/conditions, time points, etc.), but only for single reference and comparison subspace bases.
This is (close to) the “subspace alignment” index of Elsayed 2016 and Yoo 2020. The difference is for the “self-variance” in the denominator, they simply compute the sum of the n_components eigenvalues. We instead use the same projection formula as in the numerator, because it allows you to compute an internally cross-validated basis_comp (ie arguably the proper way to make this comparison). It also simplifies the input arguments. If you don’t cross-validate, you should get the same result as the Elsayed formulation (but also you are biasing results in favor of the self-projection case / against the cross-projection case, as you are training and testing on the same “comparison” data, ie “double-dipping” or doing a “circular analysis”).
- Parameters:
cov_comp (ndarray, shape=(n_features,n_features) or (...,n_features,n_features,...)) –
Cross-feature (eg cross-channel/unit) covariance matrix of “comparison” condition data.
Can be single covariance matrix (typically corresponding to covariance of a population of neural channels/units) or stack of such matrixes (eg for different trials, timepoints, etc.), with the pair of array axes corresponding to different vector features (eg channels/neurons) given in feature_axis.
basis_ref (ndarray, shape=(n_features,n_components_ref)) – Basis matrix for subspace estimated from “reference” condition, which is used to reduce data dimensionality down to n_component_ref. Column vectors correspond to sequence of orthonormal basis vectors for subspace, eg the top-k eigenvectors from a PCA; features along each column correspond to same feature in data (eg neural channels/units). For simplicity, only a single basis is allowed as input.
basis_comp (ndarray, shape=(n_features,n_components_comp)) – Basis matrix for subspace estimated from “comparison” condition subspace. Shape and interpretation otherwise same as basis_ref. Only needs to have same number of components (dimensionality) as basis_ref if keep_components is True.
feature_axis (array-like, shape=(2,), default: (-2,-1) (last two array axes)) – Axes of cov corresponding to different data vector features. Must have 2 values. This will typically correspond to covariance between different neural channels/neurons.
keep_components (bool, default: False) – If True, returns variance explained by each basis component vector (eg PC) separately. If False, return total (summed) variance explained by all basis vectors.
keepdims (bool, default: True) – If True, retains reduced feature_axis as length-one axes in output. If False, removes reduced feature_axis from outputs.
**kwargs – All other keyword args passed directly to
subspace_projection_var()
- Returns:
alignment – Subspace alignment index, measuring alignment btwn reference and comparison subspaces.
Returns a float for single matrix input. For stacked matrixes, returns array the same shape as basis but with feature_axis reduced to length 1 (if keepdims is True) or removed (if keepdims is False). If keep_components is True, returns array the same shape as data.
- Return type:
float or ndarray, shape=(…,[1,1,],…)
References
Elsayed … Cunningham (2016) Nature Comms https://doi.org/10.1038/ncomms13239
Yoo & Hayden (2020) Neuron https://doi.org/10.1016/j.neuron.2019.11.013 (use in cognition)
- subspace_error_index(basis_ref, basis_comp, feature_axis=-2, component_axis=-1, keepdims=True)¶
Compute index of similarity betweeen two subspaces estimated from distinct “reference” and “comparison” sets of data, quantified as 1 - the magnitude (L2 norm) of the cross-projection of the comparison subspace onto the nullspace of the reference subspace, normalized by the norm of the comparison subspace:
error = ||(I - W_ref * inv(W_ref.T * W_ref) * W_ref.T) * W_comp|| / ||W_comp|| index = 1 - error
where W’s are the subspace bases, and ||.|| is the L2 matrix (Frobenius) norm over the feature and component axes
This is 1 - the “normalized subspace error” index of Gokcen 2022. We compute 1 - their index, converting their measure of error into a measure of alignment. The resulting measure ranges from 0 to 1. A value of 0 indicates the column space of the reference subspace lies completely with the null space of the comparison subspace, and thus the comparison captures no component of the reference.
- Parameters:
basis_ref (ndarray, shape=(...,n_features,...,n_components_ref,...)) –
Basis matrix for subspace estimated from “reference” condition (ie which is used to reduce data dimensionality down to n_component_ref). Column vectors correspond to sequence of orthonormal basis vectors for subspace, eg the top-k eigenvectors from a PCA.
Can be single basis matrix (by default, shape=(n_features,n_components_ref)) or stack of such matrixes (eg for different experimental conditions, timepoints, etc.), with the array axis corresponding to different features (eg channels/neurons) given in feature_axis and the axis corresponding to subspace basis components given in component_axis.
basis_comp (ndarray, shape=(...,n_features,...,n_components_comp,...)) – Basis matrix for subspace estimated from “comparison” condition subspace. Shape and interpretation otherwise same as basis_ref; doesn’t need to have same number of components (dimensionality) as basis_ref.
feature_axis (int, default: -2 (second-to-last axis)) – Axis of basis_ref and basis_comp corresponding to different features. This will typically correspond to different neural channels/units. By default, second-to-last axis of array of stacked matrices (1st axis of single matrix).
component_axis (int, default: -1 (last axis)) – Axis of basis_ref/comp corresponding to different subspace components/dimensions. By default, last axis of array of stacked matrices (2nd axis of single matrix).
keepdims (bool, default: True) – If True, retains reduced feature_axis and component_axis as length-one axis in output. If False, removes reduced feature_axis and component_axis from outputs.
**kwargs – All other keyword args passed directly to
subspace_projection_var()
- Returns:
overlap – Subspace overlap index, measuring overlap btwn reference and comparison subspaces.
Returns a float for single vector input. For stacked vectors, returns array the same shape as data but with feature_axis (and sum_axis if set) reduced to length 1 (if keepdims is True) or removed (if keepdims is False).
- Return type:
ndarray, shape=float or (…,[1,]…)
References
Gokcen … Yu (2022) Nature CompSci https://doi.org/10.1038/s43588-022-00282-5 eqn. 9
- subspace_principal_angles(basis1, basis2, feature_axis=-2, component_axis=-1, output='radian', keepdims=True)¶
Compute principal angles between subspaces as a measure of subspace similarity/overlap.
These are the angles between subspaces, in increasing order of angle (decreasing order of their cosines), derived from a canonical correlation analysis btwn the two subspace bases (SVD of their matrix product). This essentially finds the optimal projection matrixes to for the two bases to minimize the angle between them; these minimized angles are returned in angles.
Note that, unlike other subspace similarity indexes, inputs here are treated symmetrically – principal_angle(basis1,basis2) = principal_angle(basis2,basis1). Also, unlike other indexes, this one is vector-valued for each subspace comparison, with one principal angle per subspace component (PC/eigenvector).
Function only returns principal angles. To also get aligned subspaces bases that minimize principal angles (optimally align subspaces), use
align_subspaces().Function allows mass-multivariate computation of index for multiple data vectors (eg different trials/conditions, time points, etc.), but only for single reference and comparison subspace bases.
Function allows mass-multivariate computation of principal angles for multiple subspaces (eg corresponding to different experimental conditions, time points, etc.).
This method was developed originally by Bjorck & Golub 1973, and first used for neuroscience applications (AFAIK) by Gallego 2018.
- Parameters:
basis1 (ndarray, shape=(n_features,n_components1) or (...,n_features,...,n_components1,...)) –
Basis matrix for first subspace to align. Column vectors (along component_axis) correspond to sequence of orthonormal basis vectors for subspace, eg the top-k eigenvectors from a PCA; features along each column typically correspond to different neural channels/units.
Can be single basis matrix or stack of such matrices in an array (eg for different experimental conditions, timepoints, etc.), with the array axis corresponding to different vector features (eg channels) given in feature_axis.
basis2 (ndarray, shape=(n_features,n_components2) or (...,n_features,...,n_components2,...)) –
Basis for second subspace to align. Often, this is derived from some other condition (eg timepoint, exp condition) that we are aligning/comparing the basis1 condition to.
May have different number of components (basis vectors), but must otherwise have same shape as basis1.
output ({'radian','degree','cosine'}, default: 'radian') – Type of “angle” to output: Angles in radians (range [0,pi]) or degrees (range [0,180]), or the angle cosines (range [-1,+1]).
feature_axis (int, default: -2 (second-to-last axis)) – Axis of basis1 and basis2 corresponding to different features. This will typically correspond to different neural channels/units.
component_axis (int, default: -1 (last axis)) – Axis of basis1 and basis2 corresponding to different basis/eigenvectors of each basis. This corresponds to different components derived from a dimensionality reduction analysis.
keepdims (bool, default: True) – If True, retains reduced feature_axis as length-one axes in output. If False, removes reduced feature_axis from outputs.
- Returns:
angles – Series of angles between each matched pair of components (basis vectors) in the compared subspaces. Length of returned angles is the minimum of the number of components betweeen basis1 and basis2.
Returns a 1D array for single matrix inputs. For stacked matrixes, returns array the same shape as basis1/2 but with length of component_axis = min(n_components1,n_components2) and feature_axis reduced to length 1 (if keepdims is True) or removed (if keepdims is False).
- Return type:
ndarray, shape=(k,) or (…,[1,]…,k,…) (k = min(n_components1,n_components2))
References
Bjorck & Golub (1973) Math Comp http://dx.doi.org/10.2307/2005662 (original method)
Gallego et al. (2018) Nature Comms https://doi.org/10.1038/s41467-018-06560-z (use in neuro)
Tang et al. (2020) eLife https://doi.org/10.7554/eLife.58154 (use in PFC)
- class OddEvenSplit(*args, **kwargs)¶
Implements traditional odd/even trial split-half cross-validation.
Depending on value of cross parameter, either implements single train/test split or two cross-validated split where odd and even swap train/test roles.
Inherits from sklearn.model_selection.BaseCrossValidator, but is needed b/c this type of cv scheme is not implemented in sklearn.
- Parameters:
cross (bool, default: False) – If False, generates a single split with odd trials for train and even trials for test. If True, generates two splits, one as above, and one with train/test roles reversed.
args (Any) –
kwargs (Any) –
- Return type:
Any
- split(X, y=None)¶
Generate indices to split data into training and test set.
- Xndarray, shape=(n_obs, n_features)
Data array. Only used to get number of observations (trials).
Any other parameters are ignored, just there to match sklearn API.
- Yields:
train_idxs (generator, shape=(n_splits,) of ndarray, shape=(n_obs/2,)) – Training set indices for split. Here, all odd or all even trials
test_idxs (generator, shape=(n_splits,) of ndarray, shape=(n_obs/2,)) – Test set indices for split. Here, all even or all odd trials.
- get_n_splits(X=None, y=None)¶
Return the number of splitting iterations in the cross-validator
- class BalancedKFold(*args, **kwargs)¶
Implements version of k-fold cross-validation with observations (trials) fully balanced between classes (conditions) within each fold.
sklearn.model_selection.StratifiedKFold()does something similar, but permits slightly unbalanced classes within each fold in order to balance the total number of observations across folds.BalancedKFold forces all classes to be balanced within each fold, but permits the overall number of observations to vary slightly across folds.
BalancedKFold should be used for applications where the number of observations (trials) must be exactly balanced across classes (conditions), such as dPCA.
NOTE: All classes/conditions must be balanced in the original (unsplit) data
Inherits from sklearn.model_selection.BaseCrossValidator, but is needed b/c this type of cv scheme is not implemented in sklearn.
- Parameters:
n_splits (int, default=5) – Number of folds. Must be at least 2.
shuffle (bool, default=True) – Whether to shuffle each class’s samples before splitting into folds. Note that if shuffle=False, folds will follow the order of trials in each condition (ie fold1 will be the first n trials, fold2 the next n, etc.)
random_state (int or None, default=None) – When shuffle is True, random_state affects the ordering of the indices, which controls the randomness of each fold for each class. Pass an int for reproducible output across multiple function calls. Otherwise unrepeatable random sequences will generated.
args (Any) –
kwargs (Any) –
- Return type:
Any
- split(X, y=None)¶
Generate indices to split data into training and test set.
- Xndarray, shape=(n_obs, n_features)
Data array. Not actually used here.
- yarray-like of shape (n_obs,)
The target variable for supervised learning problems (labels). Balancing is done based on the y labels.
- Yields:
train_idxs (generator, shape=(n_splits,) of ndarray, shape=(n_train[split],)) – Training set indices for split.
test_idxs (generator, shape=(n_splits,) of ndarray, shape=(n_test[split],)) – Test set indices for split.
- get_n_splits(X=None, y=None)¶
Return the number of splitting iterations in the cross-validator
- class DummyCrossValidator(*args, **kwargs)¶
Implements a “dummy” cross-validation object that mirrors sklearn.model_selection “Splitter” cross-validator objects, but just uses all data for both training AND testing (ie, does not actually implement any cross-validation).
Useful mainly just for simplifying code that allows cross-validation or not, so you don’t have to write separate code blocks for both cases.
- Parameters:
args (Any) –
kwargs (Any) –
- Return type:
Any
- split(X, y=None)¶
Generates indices to split data into “training” and “test” set (which are actually the same here)
- Parameters:
X (ndarray, shape=(n_obs, n_features)) – Data array. Only used to get number of observations (trials).
ignored (Any other parameters are) –
API. (just there to match sklearn) –
- Yields:
train_idxs (generator, shape=(n_splits,) of ndarray, shape=(n_obs,)) – Training set indices for split. Here, all observations (trials).
test_idxs (generator, shape=(n_splits,) of ndarray, shape=(n_obs,)) – Test set indices for split. Here, all observations (trials).
- get_n_splits(X=None, y=None)¶
Return the number of splitting iterations in the cross-validator