Neuroblastoma

Single-cell transcriptomics and epigenomics unravel the role of monocytes in neuroblastoma bone marrow metastasis

This code supplements the publication by Fetahu, Esser-Skala, Dnyansagar et al. (2023).

Folders

(Not all of these folders appear in the git repository.)

  • data_generated: output files generated by the scripts in this repository
  • data_raw: raw input data
  • doc: project documentation
  • literature: relevant publications
  • metadata: additional required data
  • misc: miscellaneous scripts
  • plots: generated plots
  • renv: R environment data
  • scatac: scripts for scATAC-seq analysis
  • tables: exported supplementary tables and data; the subfolder source_data contains source data for figures

Download data

Create a folder data_raw that will contain raw data in the following subfolders:

  • adrmed:
    • adrenal_medulla_Seurat.RDS: reference expression data for adrenal medullary cells; download from https://adrenal.kitz-heidelberg.de/developmental_programs_NB_viz/ (Download data -> Download Adrenal medulla data -> Seurat object (RDS))
  • rna_seq: Download GSE216155_RAW.tar from GEO Series GSE216155 and extract all files.
  • atac_seq: Download all files from GEO Series GSE216175 (GSE216175_barcodes.tsv.gz, GSE216175_barcodes_samples.csv.gz, GSE216175_filtered_peak_bc_matrix.h5, GSE216175_matrix.mtx.gz, GSE216175_peaks.bed.gz, and GSE216175_RAW.tar). Extract all files from the tarball.
  • GSE137804: Download the following files from GEO Series GSE137804:
    • GSE137804_tumor_dataset_annotation.csv.gz
    • GSE137804_RAW.tar, from which the following eleven files must be extracted:
      • GSM4088774_T10_gene_cell_exprs_table.xls.gz
      • GSM4088776_T27_gene_cell_exprs_table.xls.gz
      • GSM4088777_T34_gene_cell_exprs_table.xls.gz
      • GSM4088780_T69_gene_cell_exprs_table.xls.gz
      • GSM4088781_T71_gene_cell_exprs_table.xls.gz
      • GSM4088782_T75_gene_cell_exprs_table.xls.gz
      • GSM4088783_T92_gene_cell_exprs_table.xls.gz
      • GSM4654669_T162_gene_cell_exprs_table.xls.gz
      • GSM4654672_T200_gene_cell_exprs_table.xls.gz
      • GSM4654673_T214_gene_cell_exprs_table.xls.gz
      • GSM4654674_T230_gene_cell_exprs_table.xls.gz
  • snp_array: Extract the contents of snp_array.tgz provided in Zenodo repository https://doi.org/10.5281/zenodo.7707614
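The selective extraction from GSE137804_RAW.tar described above can be scripted. The following is a minimal sketch, not part of the repository: the function name `extract_selected` and the example paths are assumptions for illustration, and only the Python standard library is used.

```python
import shutil
import tarfile
from pathlib import Path

# The eleven sample files named in the list above.
WANTED = [
    "GSM4088774_T10_gene_cell_exprs_table.xls.gz",
    "GSM4088776_T27_gene_cell_exprs_table.xls.gz",
    "GSM4088777_T34_gene_cell_exprs_table.xls.gz",
    "GSM4088780_T69_gene_cell_exprs_table.xls.gz",
    "GSM4088781_T71_gene_cell_exprs_table.xls.gz",
    "GSM4088782_T75_gene_cell_exprs_table.xls.gz",
    "GSM4088783_T92_gene_cell_exprs_table.xls.gz",
    "GSM4654669_T162_gene_cell_exprs_table.xls.gz",
    "GSM4654672_T200_gene_cell_exprs_table.xls.gz",
    "GSM4654673_T214_gene_cell_exprs_table.xls.gz",
    "GSM4654674_T230_gene_cell_exprs_table.xls.gz",
]


def extract_selected(tar_path, dest_dir, wanted=WANTED):
    """Copy only the archive members named in `wanted` into `dest_dir`.

    Hypothetical helper: returns the sorted list of wanted names that
    were NOT found in the archive, so an empty list means success.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    remaining = set(wanted)
    with tarfile.open(tar_path) as tar:
        for member in tar:
            # Compare on the base name so a leading directory is ignored.
            name = Path(member.name).name
            if member.isfile() and name in remaining:
                with tar.extractfile(member) as src, open(dest / name, "wb") as out:
                    shutil.copyfileobj(src, out)
                remaining.discard(name)
    return sorted(remaining)


# Example call (paths are assumptions about where the tarball was saved):
# missing = extract_selected("data_raw/GSE137804/GSE137804_RAW.tar",
#                            "data_raw/GSE137804")
```

An empty return value indicates that all eleven files were found and written; any names returned were absent from the tarball.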

Optionally, obtain intermediary data: Extract the contents of R_data_generated.tgz from Zenodo repository https://doi.org/10.5281/zenodo.7707614 to folder data_generated.
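Before running the analysis scripts, it can help to confirm that data_raw has the layout described in this section. The sketch below is an illustration only: the helper `missing_inputs` is hypothetical, and it checks just the subfolders and the two individually named files listed above, not the full contents of the GEO archives.

```python
from pathlib import Path

# Paths relative to data_raw, taken from the folder list in this section.
EXPECTED = [
    "adrmed/adrenal_medulla_Seurat.RDS",
    "rna_seq",
    "atac_seq",
    "GSE137804/GSE137804_tumor_dataset_annotation.csv.gz",
    "snp_array",
]


def missing_inputs(root="data_raw", expected=EXPECTED):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


# Example: print anything that still needs to be downloaded.
# for rel in missing_inputs():
#     print("missing:", rel)
```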

scRNA-seq analysis

Main workflow

Run these R scripts in the given order to generate all files required by figures and tables.

Plotting functions

Run these R scripts in arbitrary order to generate publication figures and tables:

Other scripts

scATAC-seq analysis

All required scripts are in subfolder scatac.

scATAC-seq workflow

scATAC-seq scRNA-seq integration workflow

For data integration, we used the scGLUE (Graph Linked Unified Embedding) model for unpaired single-cell multi-omics data (https://scglue.readthedocs.io/en/latest/), following the detailed tutorial at https://scglue.readthedocs.io/en/latest/tutorials.html. Before running the tutorial, we converted the scRNA-seq and scATAC-seq objects to AnnData format from SingleCellExperiment and Seurat, respectively. Many tools can perform this conversion; our approach is provided in monocle_to_anndata.R and Seurat_to_anndata.R.

The following Jupyter notebooks mirror those of the scGLUE integration pipeline.

Finally, Figures.R generates publication figures.
