:notebook_with_decorative_cover: Table of Contents

About the project
Getting started
FAQ
Known issues
Contributing
License
Citation
Contributions

:star2: About the project

frrsa is a Python package for conducting Feature-Reweighted Representational Similarity Analysis (FR-RSA). Classical Representational Similarity Analysis (RSA) correlates two representational matrices, in which each cell gives a measure of how (dis-)similarly two conditions are represented by a given system (e.g., the human brain or a model like a deep neural network (DNN)). However, this might underestimate the true correspondence between the systems' representational spaces, since it assumes that all features (e.g., fMRI voxels or DNN units) contribute equally to the (dis-)similarity of each condition pair and, in turn, to the correspondence between the representational matrices. FR-RSA deploys regularized regression techniques (currently: L2-regularization) to maximize the fit between two representational matrices. The core idea behind FR-RSA is to recover a subspace of the predicting matrix that best fits the target matrix. To do so, the cells of the target system's matrix are explained by a linearly reweighted combination of the feature-specific (dis-)similarities of the respective conditions in the predicting system. Importantly, the representational matrix of each feature of the predicting system receives its own weight. All of this is implemented in a nested cross-validation, which avoids overfitting on the level of (hyper-)parameters.
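
The reweighting idea described above can be illustrated with a minimal NumPy sketch. It is not the package's implementation: the data are synthetic, the squared difference is only assumed here as the feature-specific dissimilarity, the ridge penalty `alpha` is picked arbitrarily, and the nested cross-validation is left out, so the fit is evaluated on the same data it was estimated on.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_conditions = 6, 12  # p and k

# Hypothetical predicting system: one row per feature, one column per condition.
predictor = rng.normal(size=(n_features, n_conditions))

# Indices of the unique condition pairs (upper triangle without the diagonal).
rows, cols = np.triu_indices(n_conditions, k=1)

# Feature-specific dissimilarity for every condition pair: squared difference
# of the two conditions within a single feature. Shape: (n_pairs, n_features).
feature_dissim = (predictor[:, rows] - predictor[:, cols]).T ** 2

# Synthetic target dissimilarities built from known, unequal feature weights.
true_weights = np.array([3.0, 0.0, 1.0, 0.0, 2.0, 0.5])
target_dissim = feature_dissim @ true_weights + rng.normal(scale=0.1, size=rows.size)

# Classical RSA weights all features equally: sum the feature-specific values.
classical_dissim = feature_dissim.sum(axis=1)
classical_r = np.corrcoef(classical_dissim, target_dissim)[0, 1]

# Reweighting: closed-form ridge regression of the target on the
# feature-specific dissimilarities, with an unpenalized intercept.
alpha = 1.0  # arbitrary penalty, chosen for illustration only
X = feature_dissim - feature_dissim.mean(axis=0)
y = target_dissim - target_dissim.mean()
weights = np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)
reweighted_dissim = X @ weights + target_dissim.mean()
reweighted_r = np.corrcoef(reweighted_dissim, target_dissim)[0, 1]

print(f"classical r = {classical_r:.3f}, reweighted r = {reweighted_r:.3f}")
```

Because the synthetic target was generated with unequal feature weights, the equally weighted sum correlates less strongly with it than the reweighted combination does.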

:rotating_light: Please also see the published article accompanying this repository. To use this package successfully, follow this README. :rotating_light:

:running: Getting started

:computer: Installing

The package is written in Python 3.8. Installation expects you to have a working conda on your system (e.g. via miniconda). If you have pip available already, you can skip the conda env create part.

Execute the following lines from a terminal to clone this repository and install it as a local package using pip.

cd [directory on your system where you want to download this repo to]
git clone https://github.com/ViCCo-Group/frrsa
conda env create --file=./frrsa/environment.yml
conda activate frrsa
cd frrsa
pip install -e .

:mag: How to use

There is only one user-facing function in frrsa. To use it, activate the conda environment, import and then call frrsa with your data:

from frrsa import frrsa

# load your "target" RDM or RSM.
# load your "predictor" data.
# set the necessary flags ("preprocess", "nonnegative", "measures", ...)

scores, predicted_matrix, betas, predictions = frrsa(target,
                                                     predictor,
                                                     preprocess,
                                                     nonnegative,
                                                     measures,
                                                     cv=[5, 10],
                                                     hyperparams=None,
                                                     score_type='pearson',
                                                     wanted=['predicted_matrix', 'betas', 'predictions'],
                                                     parallel='1',
                                                     random_state=None)

See frrsa/test.py for another simple demonstration.

:repeat: Parameters and returned objects

Parameters.

There are default values for all parameters, which we partly assessed (see our paper). However, you can input custom parameters as you wish. For an explanation of all parameters please see the docstring.

Returned objects.

scores: Holds the representational correspondence scores between each target and the predictor. These scores can be sensibly used in downstream analyses.
predicted_matrix: The reweighted predicted representational matrix averaged across outer folds with shape (n_conditions, n_conditions, n_targets). The value 9999 denotes condition pairs for which no (dis-)similarity was predicted (why?). This matrix should only be used for visualization purposes.
betas: Holds the weights for each target's measurement channel with the shape (n_conditions, n_targets). Note that the first weight for each target is not a channel-weight but an offset. These betas are currently computed suboptimally and should only be used for informational purposes. Do not use them to recreate the reweighted_matrix or to reweight something else (see #43).
predictions: Holds (dis-)similarities for the target and for the predictor, and to which condition pairs they belong, for all cross-validations and targets separately. This is a potentially very large object. Only request if you really need it. For an explanation of the columns see the docstring.
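
Since predicted_matrix marks unpredicted condition pairs with the value 9999, it can help to mask that value before plotting. The following is a sketch on a small made-up array that only mimics the documented shape (n_conditions, n_conditions, n_targets); the numbers are hypothetical and do not come from the package.

```python
import numpy as np

# Hypothetical stand-in for the returned `predicted_matrix`
# with shape (n_conditions, n_conditions, n_targets) = (3, 3, 1).
predicted_matrix = np.array([
    [[0.0], [0.4], [9999.0]],
    [[0.4], [0.0], [0.7]],
    [[9999.0], [0.7], [0.0]],
])

# Hide the 9999 marker so it does not dominate colour scales or summaries.
masked = np.ma.masked_equal(predicted_matrix, 9999)

# Matrix for the first (and here only) target.
first_target = masked[:, :, 0]
print(first_target.count(), "predicted cells; max =", first_target.max())
```

Plotting functions that understand NumPy masked arrays will then leave the masked cells blank.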

:question: FAQ

What does my data have to look like to use the FR-RSA package?

At present, the package expects data of two systems (e.g., a specific DNN layer and a brain region measured with fMRI), the representational spaces of which ought to be compared. The predicting system, that is, the one of which the feature-specific (dis-)similarities shall be reweighted, is expected to be a p x k numpy array. The target system contributes its full representational matrix in the form of a k x k numpy array (where p := number of measurement channels, aka features, and k := number of conditions; see Diedrichsen & Kriegeskorte, 2017). There are no hard-coded upper limits on the size of each dimension; however, the bigger k and p become, the larger the computational problem to solve becomes. See Known issues for a lower limit of k.
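
As a rough sketch of the expected shapes, the following builds synthetic inputs with NumPy. The sizes, the random data, and the choice of 1 minus Pearson correlation for the target matrix are assumptions made for illustration; the variable names simply mirror those in the usage example.

```python
import numpy as np

rng = np.random.default_rng(0)
p, k = 50, 20  # p measurement channels (features), k conditions

# Predicting system: p x k array, e.g. unit activations per condition.
predictor = rng.normal(size=(p, k))

# Target system: its raw responses are only used to build a k x k matrix.
target_responses = rng.normal(size=(30, k))
# 1 - Pearson correlation between condition columns gives a k x k RDM.
target = 1.0 - np.corrcoef(target_responses, rowvar=False)

assert predictor.shape == (p, k)
assert target.shape == (k, k)
assert np.allclose(target, target.T)      # symmetric
assert np.allclose(np.diag(target), 0.0)  # zero self-dissimilarity
```

Note that both arrays share the condition dimension k, and that conditions must be in the same order in both.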

You say that every feature gets its own weight - can those weights take on any value or are they restricted to be non-negative?

The function's parameter nonnegative can be set to either True or False and forces weights to be nonnegative (or not), accordingly.
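
To show what the constraint does to the weights, here is a minimal sketch comparing unconstrained ridge regression with a non-negative variant on synthetic data. The non-negative fit uses plain projected gradient descent, chosen only because it needs nothing beyond NumPy; it is not how the package solves the problem, and the data, `alpha`, and iteration count are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pairs, n_features = 200, 5
X = rng.normal(size=(n_pairs, n_features))
# Synthetic target with one truly negative weight (second feature).
y = X @ np.array([2.0, -1.0, 0.5, 0.0, 1.5]) + rng.normal(scale=0.1, size=n_pairs)
alpha = 1.0  # arbitrary ridge penalty

# Unconstrained ridge solution (closed form): weights may be negative.
gram = X.T @ X + alpha * np.eye(n_features)
w_free = np.linalg.solve(gram, X.T @ y)

# Non-negative ridge via projected gradient descent: take a gradient step
# on the ridge objective, then clip negative weights to zero.
step = 1.0 / np.linalg.eigvalsh(gram).max()
w_nonneg = np.zeros(n_features)
for _ in range(2000):
    gradient = gram @ w_nonneg - X.T @ y
    w_nonneg = np.maximum(w_nonneg - step * gradient, 0.0)

print("unconstrained:", np.round(w_free, 2))
print("non-negative: ", np.round(w_nonneg, 2))
```

With the constraint, a feature whose best unconstrained weight is negative can no longer contribute with a flipped sign.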

What about the covariances / interactive effects between predicting features?

One may argue that the interaction of (dis-)similarities in two or more features in one system could help in the prediction of overall (dis-)similarity in another system. Currently, though, feature reweighting does not take these interaction terms into account (nor does classical RSA). Including them would probably also be computationally too expensive for predicting systems with a lot of features (e.g. early DNN layers).
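
To get a feeling for the computational cost, one can count how many pairwise interaction terms would be added on top of the p feature-specific terms. The feature counts below are arbitrary examples, not measurements of any particular DNN layer.

```python
from math import comb


def n_pairwise_interactions(n_features: int) -> int:
    """Number of distinct feature pairs, i.e. pairwise interaction terms."""
    return comb(n_features, 2)


for n_features in (100, 10_000, 100_000):
    n_terms = n_pairwise_interactions(n_features)
    print(f"{n_features} features -> {n_terms} pairwise terms")
```

The number of pairwise terms grows quadratically with the number of features, and higher-order interactions grow faster still.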

FR-RSA uses regularization. Which kinds of regularization regimes are implemented?

As of now, only L2-regularization aka Ridge Regression.

You say ridge regression; which hyperparameter space should I check?

If you set the parameter nonnegative to False, L2-regularization is implemented using Fractional Ridge Regression (FRR; Rokem & Kay, 2020). One advantage of FRR is that the hyperparameter to be optimized is the fraction between ordinary least squares and L2-regularized regression coefficients, which ranges between 0 and 1. Hence, FRR allows assessing the full range of possible regularization parameters. In the context of FR-RSA, twenty evenly spaced values between 0.1 and 1 are pre-set. If you want to specify custom regularization values to be assessed, you can do so by providing a list of candidate values as the hyperparams argument of the frrsa function.

If you set the parameter nonnegative to True, L2-regularization is currently implemented using Scikit-Learn functions. They have the disadvantage that one has to define the hyperparameter space oneself, which can be tricky. If you do not provide hyperparameter candidates yourself, 14 pre-set values will b
