Scitex Python
Modular Python toolkit for scientific research with ~300 MCP tools — from raw data to manuscript
SciTeX (<code>scitex</code>)
<p align="center">
  <a href="https://scitex.ai">
    <img src="docs/assets/images/scitex-logo-blue-cropped.png" alt="SciTeX" width="400">
  </a>
</p>
<p align="center"><b>Modular Python Toolkit for Scientific Research Automation</b></p>
<p align="center">
  <a href="https://badge.fury.io/py/scitex"><img src="https://badge.fury.io/py/scitex.svg" alt="PyPI version"></a>
  <a href="https://pypi.org/project/scitex/"><img src="https://img.shields.io/pypi/pyversions/scitex.svg" alt="Python Versions"></a>
  <a href="https://scitex-python.readthedocs.io"><img src="https://readthedocs.org/projects/scitex-python/badge/?version=latest" alt="Documentation"></a>
  <a href="https://github.com/ywatanabe1989/scitex-python/blob/main/LICENSE"><img src="https://img.shields.io/github/license/ywatanabe1989/scitex-python" alt="License"></a>
</p>
<p align="center">
  <a href="https://scitex-python.readthedocs.io">Docs</a> ·
  <a href="https://scitex-python.readthedocs.io/en/latest/quickstart.html">Quick Start</a> ·
  <a href="https://scitex-python.readthedocs.io/en/latest/api/index.html">API</a> ·
  <code>pip install scitex[all]</code>
</p>

This repository provides scitex, the orchestration layer of the SciTeX ecosystem, which addresses key problems in scientific research:
Problem and Solution
| # | Problem | Solution |
|---|---------|----------|
| 1 | Fragmented tools -- literature search, statistics, figures, and writing each require separate tools with incompatible formats | Unified toolkit -- import scitex as stx provides 50+ modules under one namespace, accessible via Python API, CLI, and MCP |
| 2 | No verification -- existing tools address whether work could be reproduced, not whether it has been verified | Cryptographic verification -- Clew builds SHA-256 hash-chain DAGs linking every manuscript claim back to source data |
| 3 | AI agents lack context -- general-purpose LLMs cannot operate across the full research lifecycle without domain-specific tools | 293 MCP tools -- AI agents run statistics, create figures, search literature, and compile manuscripts through structured tool calls |
| 4 | No custom tooling -- every lab needs domain-specific tools, but building and sharing them requires deep infrastructure knowledge | App Maker and Store -- researchers create custom apps with scitex-app SDK and share via SciTeX Cloud |
Research Workflow
<p align="center">
  <img src="scripts/assets/workflow_out/workflow.png" alt="SciTeX Research Workflow" width="600">
</p>
<p align="center"><sub><b>Figure 1.</b> SciTeX research pipeline -- from literature search to manuscript compilation, with every step cryptographically linked.</sub></p>

Demo
40 min, minimal human intervention — an AI agent using SciTeX completed a full research cycle: literature search, statistical analysis, publication-ready figures, a 21-page manuscript, and peer review simulation.
<p align="center">
  <a href="https://scitex.ai/demos/watch/scitex-automated-research/">
    <img src="docs/assets/images/scitex-demo.gif" alt="SciTeX Demo" width="800">
  </a>
</p>

Installation
pip install scitex[all] # Recommended: everything
<details>
<summary><strong>Per-module extras</strong></summary>
pip install scitex # Core only (minimal)
pip install scitex[plt,stats,scholar] # Typical research setup
pip install scitex[plt] # Publication-ready figures (figrecipe)
pip install scitex[stats] # Statistical testing (23+ tests)
pip install scitex[scholar] # Literature search, PDF download, BibTeX enrichment
pip install scitex[writer] # LaTeX manuscript compilation
pip install scitex[audio] # Text-to-speech
pip install scitex[ai] # LLM APIs (OpenAI, Anthropic, Google) + ML tools
pip install scitex[dataset] # Scientific datasets (DANDI, OpenNeuro, PhysioNet)
pip install scitex[browser] # Web automation (Playwright)
pip install scitex[capture] # Screenshot capture and monitoring
pip install scitex[cloud] # Cloud platform integration
Requires Python 3.10+. We recommend uv for fast installs.
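
Before installing extras, it can help to confirm that the interpreter meets the 3.10 minimum stated above. The snippet below is a minimal sketch that uses only the standard library; `check_environment` is a hypothetical helper name and is not part of scitex.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 10), package="scitex"):
    """Report whether the interpreter and package look usable.

    Hypothetical helper for illustration; not shipped with scitex.
    """
    return {
        # sys.version_info compares element-wise against a tuple.
        "python_ok": sys.version_info >= min_version,
        # find_spec returns None when a top-level package is not importable.
        "package_installed": importlib.util.find_spec(package) is not None,
    }


print(check_environment())
```

If `package_installed` is false after installation, the interpreter running the check is probably not the one `pip` installed into.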
</details>
<details>
<summary><strong>Module Overview</strong></summary>

| Category | Modules | Description |
|----------|---------|-------------|
| Core | session, io, config, clew | Experiment tracking, file I/O, config, cryptographic verification |
| Analysis | stats, plt, dsp, linalg | Statistics, plotting, signal processing, linear algebra |
| Research | scholar, writer, diagram, canvas | Literature, manuscripts, diagrams, figure composition |
| ML/AI | ai, nn, torch, cv, benchmark | LLM APIs, neural networks, PyTorch, computer vision |
| Data | pd, db, dataset, schema | Pandas utilities, databases, scientific datasets |
| Infra | app, cloud, tunnel, container | App SDK, cloud, SSH tunnels, containers |
| Automation | browser, capture, audio, notification | Web automation, screenshots, TTS, notifications |
| Dev | dev, template, linter, introspect | Ecosystem tools, scaffolding, code analysis |

</details>
Quick Start
<details>
<summary><strong><code>@scitex.session</code> -- Reproducible Experiment Tracking</strong></summary>

One decorator gives you: auto-CLI, YAML config injection, random seed fixation, structured output, and logging.
import scitex as stx
import numpy as np

@stx.session
def main(
    data_path: str = "./data.csv",  # --data-path data.csv
    n_samples: int = 100,           # --n-samples 200
    CONFIG=stx.session.INJECTED,    # Aggregated ./config/*.yaml
    plt=stx.session.INJECTED,       # Pre-configured matplotlib
    logger=stx.session.INJECTED,    # Session logger
):
    """Analyze data. Docstring becomes --help text."""
    # Load
    data = stx.io.load(data_path)

    # Demo data
    x = np.linspace(0, 2 * np.pi, n_samples)
    y = np.sin(x) + np.random.randn(n_samples) * 0.1

    # FigRecipe Plot
    fig, ax = stx.plt.subplots()
    ax.plot(x, y)
    ax.set_xyt("Time", "Amplitude", "Noisy Sine Wave")

    # Save sine.png + sine.csv with logging message
    stx.io.save(fig, "sine.png")
    return 0

if __name__ == "__main__":
    main()
$ python script.py --data-path experiment.csv --n-samples 200
$ python script.py --help
# usage: script.py [-h] [--data-path DATA_PATH] [--n-samples N_SAMPLES]
# Analyze data. Docstring becomes --help text.
script_out/FINISHED_SUCCESS/2026-03-18_14-30-00_Z5MR/
├── sine.png, sine.csv # Figure + auto-exported plot data
├── CONFIGS/CONFIG.yaml # Frozen parameters
└── logs/{stdout,stderr}.log # Execution logs
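
To make the auto-CLI behavior concrete, here is a minimal sketch of how a decorator can turn keyword defaults into command-line flags using `inspect` and `argparse`. It is not the implementation of `@stx.session`: `auto_cli` is a hypothetical name, every parameter is assumed to have a `str`, `int`, or `float` default, and config injection, seeding, and logging are left out.

```python
import argparse
import inspect


def auto_cli(func):
    """Wrap `func` so its keyword defaults become command-line options.

    Hypothetical helper for illustration; not part of scitex.
    """

    def wrapper(argv=None):
        # The docstring doubles as the --help description.
        parser = argparse.ArgumentParser(description=inspect.getdoc(func))
        for name, param in inspect.signature(func).parameters.items():
            # data_path -> --data-path; the default's type drives parsing.
            parser.add_argument(
                "--" + name.replace("_", "-"),
                dest=name,
                type=type(param.default),
                default=param.default,
            )
        args = parser.parse_args(argv)  # argv=None falls back to sys.argv
        return func(**vars(args))

    return wrapper


@auto_cli
def main(data_path: str = "./data.csv", n_samples: int = 100):
    """Analyze data. Docstring becomes --help text."""
    return data_path, n_samples


print(main(["--data-path", "experiment.csv", "--n-samples", "200"]))
```

Passing an explicit argument list, as in the last line, keeps the sketch testable; calling `main()` with no arguments reads from `sys.argv` instead.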
</details>
<details>
<summary><strong><code>scitex.clew</code> -- Cryptographic Verification for AI-Driven Science</strong></summary>
As AI agents produce research at scale, the question shifts from "could this be reproduced?" to "has this been verified?". Clew builds a SHA-256 hash-chain DAG linking every manuscript claim back to source data.
import scitex as stx
# Every stx.io.load/save automatically records file hashes -- zero config
stx.clew.status() # {'verified': 12, 'mismatched': 0, 'missing': 0}
stx.clew.chain("results/figure1.png") # Trace one file back to source data
stx.clew.dag(claims=True) # Verify all manuscript claims
# Register traceable assertions
stx.clew.add_claim(
    file_path="paper/main.tex", claim_type="statistic", line_number=142,
    claim_value="t(58) = 2.34, p = .021",
    source_session="2026-03-18_14-30-00_Z5MR", source_file="results/stats.csv",
)
stx.clew.mermaid(claims=True) # Visualize provenance DAG
| Mode | Function | Answers |
|------|----------|---------|
| Project | clew.dag() | Is the whole project intact? |
| File | clew.chain("output.csv") | Can I trust this specific file? |
| Claim | clew.verify_claim("Fig 1") | Is this manuscript assertion valid? |
Verification runs at three levels: L1 hash comparison (milliseconds), L2 sandbox re-execution (minutes), and L3 registered timestamp proof (optional).
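
The idea behind a hash-chain DAG can be sketched with `hashlib` alone: each file's hash covers its own bytes plus the hashes of its inputs, so a change to source data invalidates everything derived from it. This illustrates the general technique and is not Clew's implementation; `node_hash`, `build_dag`, and `verify` are hypothetical names, files are held in memory as bytes for brevity, and every parent is assumed to be present in `files`.

```python
import hashlib


def node_hash(content: bytes, parent_hashes: list[str]) -> str:
    """Hash a file's bytes together with the hashes of its inputs."""
    h = hashlib.sha256(content)
    for parent in sorted(parent_hashes):  # sorted so input order is irrelevant
        h.update(bytes.fromhex(parent))
    return h.hexdigest()


def build_dag(files: dict[str, bytes], parents: dict[str, list[str]]) -> dict[str, str]:
    """Return {name: chained hash}; `parents` maps each file to its inputs."""
    hashes: dict[str, str] = {}

    def visit(name: str) -> str:
        if name not in hashes:
            inputs = [visit(p) for p in parents.get(name, [])]
            hashes[name] = node_hash(files[name], inputs)
        return hashes[name]

    for name in files:
        visit(name)
    return hashes


def verify(recorded: dict[str, str], files: dict[str, bytes],
           parents: dict[str, list[str]]) -> dict[str, bool]:
    """Recompute the chain from current contents and compare with the record."""
    current = build_dag(files, parents)
    return {name: current.get(name) == digest for name, digest in recorded.items()}


files = {"raw.csv": b"1,2,3\n", "stats.csv": b"mean,2\n", "figure1.png": b"PNG"}
parents = {"stats.csv": ["raw.csv"], "figure1.png": ["stats.csv"]}
recorded = build_dag(files, parents)

# Tampering with the source data invalidates every downstream node.
tampered = dict(files, **{"raw.csv": b"1,2,4\n"})
print(verify(recorded, tampered, parents))
```

This corresponds only to the L1 level (hash comparison); re-execution and timestamp proofs are outside the scope of the sketch.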
<p align="center">
  <img src="docs/clew-dag.png" alt="Clew DAG" width="300">
</p>
<p align="center"><sub><b>Figure 2.</b> Clew verification DAG -- green nodes are verified (hash match), red nodes have mismatches. Each node shows its SHA-256 hash prefix.</sub></p>
</details>
<details>
<summary><strong><code>scitex.io</code> -- Unified File I/O (50+ Formats)</strong></summary>

import scitex as stx
# Save and load -- format detected from extension
stx.io.save(df, "results.csv")
df = stx.io.load("results.csv")
stx.io.save(arr, "data.npy")
arr = stx.io.load("data.npy")
stx.io.save(fig, "figure.png") # Also exports figure data as CSV
stx.io.save(config, "config.yaml")
stx.io.save(model, "model.pkl")
# Aggregate ./config/*.yaml into a single DotDict
CONFIG = stx.io.load_configs(config_dir="./config")
print(CONFIG.MODEL.hidden_size) # Dot-notation access
# Register custom formats
@stx.io.register_saver(".custom")
def save_custom(obj, path, **kw):
    with open(path, "w") as f:
        f.write(str(obj))

@stx.io.register_loader(".custom")
def load_custom(path, **kw):
    with open(path) as f:
        return f.read()
Supports: CSV, JSON, YAML, TOML, HDF5, NPY, NPZ, PKL, PNG, JPG, SVG, PDF, Excel, Parquet, Zarr, INI, TXT, MAT, WAV, MP3, BibTeX, and more.
Built-in features: Auto directory creation, path resolution to <script_name>_out/, symlinks (symlink_from_cwd=True), save logging with
