NetREm

Network regression embeddings reveal cell-type transcription factor coordination for gene regulation

By: Saniya Khullar, Xiang Huang, Raghu Ramesh, John Svaren, Daifeng Wang

Summary

NetREm is a software package that uses network-constrained regularization for biological applications and other network-based learning tasks. In biology, traditional regression methods can struggle with correlated predictors, particularly transcription factors (TFs) that regulate target genes (TGs) in gene regulatory networks (GRNs). NetREm incorporates information from prior biological networks to improve predictions and identify complex relationships among predictors (e.g., TF-TF coordination: direct or indirect interactions among TFs). This approach can highlight important nodes and edges in the network, reveal novel regularized embeddings for genes, provide insights into underlying biological processes, identify subnetworks of predictors that group together to influence the response variable, and improve both the accuracy and the biological or clinical significance of the models. NetREm can incorporate multiple types of network data, including protein-protein interaction (PPI) networks, gene co-expression networks, and metabolic networks. In summary, network-constrained regularization can help build more accurate and interpretable models that incorporate prior knowledge of the network structure among predictors.
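As a rough illustration of what network-constrained regularization means, one common textbook form adds a graph penalty to the Lasso objective. The formulation below is a generic sketch and not necessarily the exact objective NetREm optimizes: the use of the graph Laplacian $L = D - W$ of the prior network, the scaling constants, and the symbols $\lambda_1$ and $\lambda_2$ are assumptions made for illustration.

```latex
\hat{\beta} = \arg\min_{\beta}\; \frac{1}{2n}\lVert y - X\beta \rVert_2^2
  + \lambda_1 \lVert \beta \rVert_1
  + \frac{\lambda_2}{2}\, \beta^{\top} L \beta,
\qquad
\beta^{\top} L \beta = \sum_{i<j} w_{ij}\,(\beta_i - \beta_j)^2
```

Here $X$ holds the predictors (e.g., TF expression), $y$ is the response (e.g., TG expression), and $w_{ij}$ is the prior edge weight between predictors $i$ and $j$. The graph term pulls the coefficients of strongly connected predictors toward each other, while the $\ell_1$ term keeps the model sparse.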

Pipeline

[Figure: NetREm pipeline]

Hardware Requirements

The minimum requirement is a computer with 8 GB of RAM and 32 GB of storage. For large prior graph networks, 32 GB of RAM is recommended.

Software Requirements and Installation Guide

The software uses Python 3.10. After downloading the NetREm GitHub code, conda/Anaconda users can install it with the following steps:

In the Anaconda prompt, create a Python 3.10 virtual environment by running: conda create -n NetREm python=3.10
Activate the environment: conda activate NetREm
Change the current directory to the NetREm folder.
Install the packages and dependencies (matplotlib, networkx, numpy, pandas, plotly.express, scipy, scikit-optimize, scikit-learn, tqdm; the math, os, random, sys, typing, and warnings modules that the code also uses ship with Python): pip install -r requirements.txt
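After the install step, a quick check like the one below can confirm that the third-party packages are visible in the active environment. This is a minimal sketch, not part of NetREm: the helper name `missing_modules` and the `THIRD_PARTY` list are hypothetical, and the list uses import names (scikit-optimize imports as `skopt`, scikit-learn as `sklearn`).

```python
import importlib.util

# Import names for the third-party packages listed in the install step.
# Hypothetical list for illustration; adjust it to match requirements.txt.
THIRD_PARTY = ["matplotlib", "networkx", "numpy", "pandas", "plotly",
               "scipy", "skopt", "sklearn", "tqdm"]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(THIRD_PARTY)
    print("Missing packages:", missing or "none")
```

Run it inside the activated NetREm environment; any name it reports can be installed with pip under its package name.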

Please note that if you encounter import errors from files or functions in the code folder (such as Netrem_model_builder.py), add an empty file named __init__.py to the code folder, and add the "code." prefix to all imports from the "code" folder. For example, change import Netrem_model_builder as nm to import code.Netrem_model_builder as nm.

Usage of the NetREm main function netrem()

NetREm fits a network-constrained Lasso regression model using user-provided edge weights for the prior network. The main function is netrem, with the following usage:

netrem(
    edge_list,
    beta_net = 1,
    alpha_lasso = 0.01,
    default_edge_weight = 0.01,
    edge_vals_for_d = True,
    w_transform_for_d = "none",
    degree_threshold = 0.5,
    gene_expression_nodes = [],
    overlapped_nodes_only = False,
    y_intercept = False,
    view_network = False,
    model_type = "Lasso",
    ...
)
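To give a feel for how a prior edge list and the two penalty strengths interact, here is a small NumPy-only sketch of a generic network-constrained Lasso. It is not NetREm's implementation and does not call the netrem function: the helpers `laplacian_from_edges`, `soft_threshold`, and `network_lasso` are hypothetical, the graph-Laplacian penalty and the proximal-gradient (ISTA) solver are choices made for illustration, and the parameter names `alpha_lasso`, `beta_net`, and `default_edge_weight` are only borrowed from the signature above. In particular, filling unlisted node pairs with `default_edge_weight` and passing edges as `[node, node, weight]` triples are assumptions.

```python
import numpy as np


def laplacian_from_edges(edge_list, nodes, default_edge_weight=0.01):
    """Build a weighted graph Laplacian L = D - W over `nodes`.

    Assumption: node pairs absent from edge_list get default_edge_weight.
    """
    index = {name: i for i, name in enumerate(nodes)}
    p = len(nodes)
    W = np.full((p, p), float(default_edge_weight))
    np.fill_diagonal(W, 0.0)
    for u, v, w in edge_list:
        i, j = index[u], index[v]
        W[i, j] = W[j, i] = float(w)
    D = np.diag(W.sum(axis=1))
    return D - W


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def network_lasso(X, y, L, alpha_lasso=0.01, beta_net=1.0, n_iter=5000):
    """Minimise (1/2n)||y - Xb||^2 + (beta_net/2) b'Lb + alpha_lasso*||b||_1."""
    n, p = X.shape
    H = X.T @ X / n + beta_net * L            # Hessian of the smooth part
    step = 1.0 / np.linalg.eigvalsh(H)[-1]    # 1 / largest eigenvalue
    Xty = X.T @ y / n
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = H @ b - Xty                    # gradient of the smooth part
        b = soft_threshold(b - step * grad, step * alpha_lasso)
    return b


# Toy data: TF1 and TF2 drive the target gene and share a strong prior edge.
rng = np.random.default_rng(0)
nodes = ["TF1", "TF2", "TF3", "TF4"]
edge_list = [["TF1", "TF2", 0.9], ["TF3", "TF4", 0.2]]
X = rng.normal(size=(200, 4))
y = X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=200)

L = laplacian_from_edges(edge_list, nodes)
coef = network_lasso(X, y, L)
print(dict(zip(nodes, np.round(coef, 3))))
```

In this sketch, raising `beta_net` pulls the coefficients of strongly connected predictors closer together, while raising `alpha_lasso` drives more coefficients to exactly zero.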

