HYDRA

Section of Biomedical Image Analysis Department of Radiology University of Pennsylvania Richard Building 3700 Hamilton Walk, 7th Floor Philadelphia, PA 19104

Web: https://www.med.upenn.edu/sbia/ Email: sbia-software at uphs.upenn.edu

Author: Erdem Varol software@cbica.upenn.edu

===============

INTRODUCTION =============== This software performs clustering of heterogenous disease patterns within patient group. The clustering is based on seperating the patient imaging features from the control imaging features using a convex polytope classifier. Covariate correction can be performed optionally.

=============== 2. TESTING & INSTALLATION

This software has been primarily implemented in MATLAB for Linux operating systems.

Requirements

Matlab optimization toolbox
Matlab version >2014

Installation

Hydra can be run directly in a matlab environment without compilation.

OPTIONAL:

If the user wants to run hydra as a standalone executable, then it must be compiled as following (using the additionally obtained matlab compiler "mcc"):

Run the following command in a MATLAB environment:

mcc -m hydra.m

Test

We provided a test sample in the test folder.

To test in matlab enviroment, use the command:

hydra('-i','test.csv','-o','.','-z','test_covar.csv','-k',3,'-f',3)

To test in command line using the compiled executable, use the command:

hydra -i test.csv -o . -z test_covar.csv -k 3 -f 3

This runs a HYDRA experiment which may take a few minutes. The test case contains a subset of a functional MRI study dataset by T. Satterwaithe comprising 100 subjects and their functional ROI's. The output is the clustering labels of the input subjects (only patients are clustered) at varying clustering levels. Also, the clustering stability at varying levels is output to show the rationale for choosing the clustering level.

Test Verification

Pre-computed HYDRA results have been included in directory "Pre_computed_test_results". The user may verify that their test results match the pre-computed results to confirm proper set-up. If the clustering occurred properly, ARI for clustering level k=3 should be greater than that of clustering level k=2.

========== 3. USAGE

I. Running "HYDRA":

Here is a brief introduction to running HYDRA. For a complete list of parameters, see --help option.

To run this software, you will need an input csv file, with the following mandatory fields in the following column order: (Column 1) ID: ID for subject (Column 2---(last minus 1)) features: features to be used for clustering (Column (last)) groups: label whether the subject is control (-1) or patient (1)

NOTE: Controls must be strictly -1 and patients must be 1 label. NOTE: Label headers names are not strict.

An example input csv file looks as following:

ID, feature_1, feauture_2, feature_3, group subject_1, 5, 1, 79.3, -1 subject_2, 10, 1, 71.4, 1 subject_3, 3, 1, 82.7, -1

Optionally, you can provide a covariate file that will be used to remove covariate effects from imaging features before HYDRA analysis. The covariate file has the following format: (Column 1) ID: ID for subject (Column 2---(last)) covariates: covariates of subjects

An example covariate csv file looks as following:

ID, age, sex subject_1, 29, 1 subject_2, 35, 1 subject_3, 51, 0

If you install the package successfully, there will be two ways of running HYDRA:

Running HYDRA in a matlab environment, a simple example:

 hydra('-i','test.csv','-o','.','-z','test_covar.csv','-k',3,'-f',3)

Running matlab compiled HYDRA executables in the command line, a simple example:

hydra -i test.csv -o . -z test_covar.csv -k 3 -f 3

The software returns:

HYDRA_results.mat in the specified output directory.

This mat file stores the following variables

CIDX - clustering indices for subjects (rows) at varying levels (columns) ARI - adjusted rand index of clustering at varying levels, clustering level at the highest ARI should be selected ID - subject ID of rows

=========== 4. REFERENCE

If you find this software useful, please cite:

Varol, Erdem, Aristeidis Sotiras, Christos Davatzikos, and Alzheimer's Disease Neuroimaging Initiative. "HYDRA: Revealing heterogeneity of imaging and genetic patterns through a multiple max-margin discriminative analysis framework." NeuroImage 145 (2017): 346-364.

=========== 5. LICENSING

See https://www.med.upenn.edu/sbia/software-agreement.html or COPYING.txt file.