Msmu
Python toolkit for modular and traceable LC-MS/MS proteomics analysis based on MuData
Overview
msmu is an open-source Python package for modular and traceable post-DB search preprocessing and statistical analysis of bottom-up proteomics data.
It provides modules for each step of end-to-end processing, from search output parsing through hierarchical summarization, normalization, batch correction, statistical analysis, and visualization, all implemented with commonly used analytical and statistical methods.
Central to msmu is MuData (together with AnnData), a versatile, standardized, and provenance-aware data container that organizes and stores annotations and representations of multi-dimensional MS data along with their processing history.
Combining this flexible processing pipeline with MuData supports downstream analysis aligned with the FAIR principles for biomarker discovery and systems biology.
Key Features
- Flexible data ingestion from Sage, DIA-NN, and other popular DB search tools
- MuData/AnnData-compatible object structure for organizing multi-level MS data
- Protein inference: infer protein groups from peptide evidence using the parsimony rule
- Normalization: median centering, quantile normalization, etc.
- Batch correction for discrete and continuous sources of variation
- Built-in QC: identification count, peptide length, charge, missed cleavage, intensity distribution, etc.
- Statistical analysis: differential expression analysis, dimensionality reduction
- PTM data support and stoichiometry adjustment using a matched global dataset (if available)
- Visualization: PCA, UMAP, volcano plots, heatmaps, QC metrics
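To make the median centering item in the list above concrete, here is a minimal sketch in plain Python. It is not msmu's implementation or API: the function name `median_center`, the dict-of-lists input, and the choice to re-add the mean of the sample medians are all assumptions made for illustration.

```python
from statistics import median


def median_center(log_intensities):
    """Align samples by subtracting each sample's median (hypothetical helper).

    log_intensities: dict mapping sample name -> list of log2 intensities.
    None marks a missing value and is passed through unchanged.
    """
    # Median of the observed (non-missing) values in each sample.
    sample_medians = {
        sample: median(v for v in values if v is not None)
        for sample, values in log_intensities.items()
    }
    # Re-add the mean of the sample medians so values stay on the original scale.
    offset = sum(sample_medians.values()) / len(sample_medians)
    return {
        sample: [
            None if v is None else v - sample_medians[sample] + offset
            for v in values
        ]
        for sample, values in log_intensities.items()
    }


# Toy data: two samples with a constant shift between them.
raw = {
    "sample_a": [20.0, 22.0, 24.0, None],
    "sample_b": [21.0, 23.0, 25.0, 27.0],
}
centered = median_center(raw)
```

After centering, both samples share the same median, so a systematic loading difference between them no longer shows up as an apparent abundance change.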
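The parsimony rule mentioned under protein inference can be illustrated with a greedy set-cover sketch: keep choosing the protein that explains the most still-unexplained peptides until every peptide is covered. This is a simplified stand-in, not msmu's algorithm; the function name `parsimonious_proteins` and the input mapping are hypothetical, and real protein grouping also has to handle indistinguishable proteins.

```python
def parsimonious_proteins(peptide_to_proteins):
    """Greedy approximation of a minimal protein set (hypothetical helper).

    peptide_to_proteins: dict mapping peptide -> list of candidate proteins.
    Returns the selected proteins in the order they were chosen.
    """
    # Invert the mapping: protein -> set of peptides it could explain.
    protein_to_peptides = {}
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides.setdefault(protein, set()).add(peptide)

    # Only peptides with at least one candidate protein can be explained.
    unexplained = {pep for pep, prots in peptide_to_proteins.items() if prots}
    selected = []
    while unexplained:
        # max() returns the first maximum, so sorting breaks ties alphabetically.
        best = max(
            sorted(protein_to_peptides),
            key=lambda p: len(protein_to_peptides[p] & unexplained),
        )
        selected.append(best)
        unexplained -= protein_to_peptides[best]
    return selected


# Toy evidence: P2 shares all of its peptides with P1 and P3.
evidence = {
    "PEPTIDEA": ["P1", "P2"],
    "PEPTIDEB": ["P1"],
    "PEPTIDEC": ["P2", "P3"],
    "PEPTIDED": ["P3"],
}
inferred = parsimonious_proteins(evidence)
```

In this toy case P1 and P3 together explain all four peptides, so P2 is not needed and is left out of the inferred set.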
Supported DB Search Tools
- Sage: https://sage-docs.vercel.app
- DIA-NN: https://github.com/vdemichev/DIA-NN
- MaxQuant: https://www.maxquant.org/
- FragPipe: https://fragpipe.nesvilab.org/
- Support for more tools is planned.
Documentation
Comprehensive documentation, including installation instructions, tutorials, and API references, is available at: https://bertis-informatics.github.io/msmu/
Citation
If you use msmu in your research, please cite the following publication (preprint):
msmu: a Python toolkit for modular and traceable LC-MS proteomics data analysis based on MuData
Hyung-Wook Choi, Byeongchan Lee, Un-Beom Kang, Sunghyun Huh
bioRxiv 2026.01.07.698308; doi: 10.64898/2026.01.07.698308
License
BSD 3-Clause License. See LICENSE for details.
