muMerge

muMerge is a tool for combining overlapping BED regions from multiple BED files.

Installation

To install muMerge, clone the GitHub repository:

$ git clone https://github.com/Dowell-Lab/mumerge.git

Requirements

muMerge is written in Python 3 and uses the following software and modules:

  • python v3
  • numpy
  • matplotlib
  • bedtools
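
One way to confirm that these requirements are available before running muMerge is sketched below. The helper name `check_requirements` is hypothetical and is not part of muMerge; it only uses the Python standard library to look for the `numpy` and `matplotlib` modules and for a `bedtools` executable on the PATH.

```python
import importlib.util
import shutil
import sys


def check_requirements():
    """Hypothetical helper: map each requirement name to True/False."""
    return {
        # muMerge is written for Python 3
        "python3": sys.version_info.major == 3,
        # find_spec returns None when a top-level module cannot be located
        "numpy": importlib.util.find_spec("numpy") is not None,
        "matplotlib": importlib.util.find_spec("matplotlib") is not None,
        # shutil.which returns None when the executable is not on the PATH
        "bedtools": shutil.which("bedtools") is not None,
    }


for name, found in check_requirements().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

This checks only that each requirement can be located, not which version is installed.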

Usage

Help command

For general usage, run the help command:

$ python mumerge.py -h

This prints the options available for running muMerge:

usage: mumerge.py [-h] [-H] [-i INPUT] [-o OUTPUT] [-w WIDTH] [-m MERGED] [-r] [-v]

Merges region calls (mu) generated by Tfit, or other peak calling functions across multiple samples and replicates.

optional arguments:
  -h, --help            show this help message and exit
  -H, --HELP            Verbose help info about the input format.
  -i INPUT, --input INPUT
                        Input file (full path) containing bedfiles, sample ID's and replicate grouping names (tab delimited). Each sample on separate line. First line header, equal to '#file<TAB>sampid<TAB>group',
                        required. 'file' must be full path. 'sampid' can be any string. 'group' can be string or integer. See '-H' help flag for more information.
  -o OUTPUT, --output OUTPUT
                        Output file basename (full path, sans extension). WARNING: will overwrite any existing file)
  -w WIDTH, --width WIDTH
                        The ratio of a the sigma for the corresponding probabilty distribution to the bed region (half-width) --- sigma:half-bed (default: 1???). The choice for this parameter will depend on the data
                        type as well as how bed regions were inferred from the expression data.
  -m MERGED, --merged MERGED
                        Sorted bedfile (full path) containing the regions over which to combine the sample bedfiles. If not specified, mumerge will generate one directly from the sample bedfiles.
  -r, --remove_singletons
                        Remove calls not present in more than 1 sample
  -v, --verbose         Verbose printing during processing.
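
The `-w` help text describes the width parameter as the ratio sigma:half-bed. Read literally, that gives sigma = ratio × (half the BED region length). The sketch below illustrates only that arithmetic; the function name `sigma_from_region` is hypothetical, it is not taken from the muMerge source, and the ratio is left as a required argument because the help text does not state a definite default.

```python
def sigma_from_region(start, stop, ratio):
    """Hypothetical illustration of the sigma:half-bed ratio.

    start, stop: BED coordinates (0-based, half-open), so stop - start
                 is the region length.
    ratio:       the value passed to -w / --width.
    """
    if stop <= start:
        raise ValueError("stop must be greater than start")
    half_width = (stop - start) / 2
    return ratio * half_width


# A 200 bp region has a half-width of 100 bp.
print(sigma_from_region(100, 300, 1.0))  # 100.0
print(sigma_from_region(100, 300, 0.5))  # 50.0
```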

Input files

The <INPUT> file is a tab-delimited text file that lists the paths to the BED files to be merged, along with a sample name and condition/replicate group for each sample. In the example below, there are four samples in two groups (control and treatment).

#file	sampid	group
/path/to/sample1.bed	sample1	control
/path/to/sample2.bed	sample2	control
/path/to/sample3.bed	sample3	treatment
/path/to/sample4.bed	sample4	treatment
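
Because the columns must be separated by tabs, it can be convenient to generate this file from a script rather than by hand. The following is a minimal sketch; the function name `write_sample_sheet` is hypothetical, and the paths are the placeholder paths from the example above.

```python
def write_sample_sheet(path, samples):
    """Hypothetical helper: write a tab-delimited muMerge input file.

    samples: iterable of (bedfile, sampid, group) tuples.
    """
    lines = ["#file\tsampid\tgroup"]  # header line described in the help text
    for bedfile, sampid, group in samples:
        fields = [str(bedfile), str(sampid), str(group)]
        # A tab or newline inside a field would break the column layout.
        if any("\t" in field or "\n" in field for field in fields):
            raise ValueError(f"field contains a tab or newline: {fields}")
        lines.append("\t".join(fields))
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")


samples = [
    ("/path/to/sample1.bed", "sample1", "control"),
    ("/path/to/sample2.bed", "sample2", "control"),
    ("/path/to/sample3.bed", "sample3", "treatment"),
    ("/path/to/sample4.bed", "sample4", "treatment"),
]
write_sample_sheet("sample_information.txt", samples)
```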

Example run command

$ python mumerge.py -i sample_information.txt -o output_path/project_id
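
The same run can be launched from a Python script, for example as part of a larger pipeline. The sketch below only assembles the command from flags listed in the help output (`-i`, `-o`, `-r`, `-v`); the function name `build_mumerge_command` is hypothetical, and the location of `mumerge.py` is an assumption that depends on where the repository was cloned.

```python
import subprocess
import sys


def build_mumerge_command(input_file, output_base, script="mumerge.py",
                          remove_singletons=False, verbose=False):
    """Hypothetical helper: build the muMerge command line as a list."""
    cmd = [sys.executable, script, "-i", input_file, "-o", output_base]
    if remove_singletons:
        cmd.append("-r")  # remove calls not present in more than 1 sample
    if verbose:
        cmd.append("-v")  # verbose printing during processing
    return cmd


cmd = build_mumerge_command("sample_information.txt", "output_path/project_id")
print(" ".join(cmd))
# To actually run muMerge (the script path must be valid):
# subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems when paths contain spaces.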

Output files

muMerge writes the merged regions in BED format (project_id_MUMERGE.bed). It also writes a log file summarizing the run (project_id.log), along with intermediate files (project_id_MISCALLS.bed, project_id_BEDTOOLS_MERGE.bed).
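
For downstream scripts, the expected output paths can be derived from the basename passed to `-o`. The sketch below assumes the suffixes are appended directly to that basename, as the file names above suggest; the function name `expected_outputs` is hypothetical.

```python
def expected_outputs(output_base):
    """Hypothetical helper: map output names to the paths muMerge is expected to write."""
    return {
        "merged": f"{output_base}_MUMERGE.bed",
        "log": f"{output_base}.log",
        "miscalls": f"{output_base}_MISCALLS.bed",
        "bedtools_merge": f"{output_base}_BEDTOOLS_MERGE.bed",
    }


for name, path in expected_outputs("output_path/project_id").items():
    print(f"{name}: {path}")
```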

Run time

The overall run time depends on the number of input BED files and the number of regions being merged. A test case in which 8 samples (~30,000 regions) from 6 condition groups were merged took about 12 minutes on a MacBook Pro (2.3 GHz Intel Core i9) running macOS 10.14.6, with the following software versions:

  • python version 3.6.3
  • numpy version 1.19.1
  • matplotlib version 3.2.2
  • bedtools version 2.30.0

Citation

Please cite the following article if you use muMerge:

@article{rubin2021transcription,
  title={Transcription factor enrichment analysis (TFEA) quantifies the activity of multiple transcription factors from a single experiment},
  author={Rubin, Jonathan D and Stanley, Jacob T and Sigauke, Rutendo F and Levandowski, Cecilia B and Maas, Zachary L and Westfall, Jessica and Taatjes, Dylan J and Dowell, Robin D},
  journal={Communications biology},
  volume={4},
  number={1},
  pages={1--15},
  year={2021},
  publisher={Nature Publishing Group}
}
