ModelCypher

Decipher the high-dimensional geometry of LLMs. An open source x-ray into LLM representation structure.

See what a model is doing below token level.

ModelCypher is a measurement and observability workbench for open-source model builders. It gives humans and frontier AI a clear way to inspect geometry, entropy, curvature, chain structure, and adapter-induced changes through workflow-first CLI surfaces instead of ad hoc activation scripts.

Current evidence state (2026-04-02): mc analyze is the clearest public entrypoint for prompt capture, prompt-family studies, and checkpoint or adapter comparison. mc train run remains shipped and geometry-derived, but the repo has not yet closed the promotable same-model same-data same-eval benchmark needed to claim "better than standard practice." See RESEARCH-ROADMAP.md.

The Thesis

A forward pass is a deterministic geometric map. The industry treats 15 training hyperparameters as knobs to tune — learning rate, rank, scale, warmup, clipping, schedule, decay, dropout, batch size, early stopping, target modules, weight init, epsilon, momentum, residual scaling. Every one of these has a closed-form geometric replacement derived from SVD, IEEE 754 machine precision, or a cited theorem. ModelCypher replaces all 15. See AGENTS.md for the full derivation philosophy.
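The replacement pattern is concrete enough to sketch. As a toy illustration only (a hypothetical helper, not ModelCypher's actual code): a step-size ceiling can fall out of the largest singular value of a weight matrix instead of a hand-tuned learning rate.

```python
import numpy as np

def spectral_step_ceiling(weight: np.ndarray) -> float:
    """Toy geometric step-size bound: 1 / sigma_max(W)^2.

    Illustrative only; it stands in for the kind of spectral ceiling
    the table below attributes to Weyl 1912 / Loizou 2020.
    """
    sigma_max = np.linalg.svd(weight, compute_uv=False)[0]
    return 1.0 / (sigma_max ** 2)

W = np.random.default_rng(0).standard_normal((64, 64)) / 8.0
lr = spectral_step_ceiling(W)
print(f"derived step ceiling: {lr:.4f}")  # no tuning, no schedule
```

The point is the shape of the move: a measured spectral quantity replaces a copied default, so the "knob" disappears.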

Start By Measuring

poetry run mc analyze capture --model /path/to/model --prompt "Explain geodesics."
poetry run mc analyze family --model /path/to/model --manifest data/probes/prompt_family_minimal_pairs.json
poetry run mc analyze compare --left-model /path/to/base --right-model /path/to/base --right-adapter /path/to/adapter --manifest data/probes/prompt_family_minimal_pairs.json
poetry run mc analyze report --bundle /path/to/bundle
poetry run mc analyze report --bundle results/measurement_atlas/<run_id>
poetry run python scripts/run_measurement_atlas.py --model /path/to/model --manifest data/probes/measurement_atlas_casing.json --manifest data/probes/measurement_atlas_profanity_tone.json --manifest data/probes/measurement_atlas_grounded_hallucination.json --output-root results/measurement_atlas

These commands emit an observation bundle under results/analysis/<timestamp-slug>/ by default:

  • manifest.json
  • summary.json
  • REPORT.md
  • variants.jsonl
  • layer_metrics.jsonl
  • comparisons.jsonl
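Because the bundle is plain JSON and JSONL, it is trivial to consume programmatically. A minimal reader sketch, assuming only the file names listed above (the per-row schema is whatever each JSONL file actually contains):

```python
import json
from pathlib import Path

def read_bundle(bundle_dir: str) -> dict:
    """Load the machine-readable parts of an observation bundle.

    File names match the bundle layout above; no schema beyond
    "one JSON object per line" is assumed for the JSONL files.
    """
    root = Path(bundle_dir)
    summary = json.loads((root / "summary.json").read_text())
    layer_rows = [
        json.loads(line)
        for line in (root / "layer_metrics.jsonl").read_text().splitlines()
        if line.strip()
    ]
    return {"summary": summary, "layer_metrics": layer_rows}
```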

The prompt-family interface is explicit in phase 1. Each row includes: case_id, variant_id, text, optional tags, and optional comparison_to.
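A minimal-pair family using that row schema might look like the following. Only the per-row field names come from the description above; the top-level `cases` key and all values are illustrative assumptions, not the repo's actual manifest format.

```python
import json

# Minimal-pair family: same question, casing perturbed.
# Row fields (case_id, variant_id, text, tags, comparison_to) follow
# the schema stated above; everything else here is illustrative.
manifest = {
    "cases": [
        {
            "case_id": "geodesic-casing",
            "variant_id": "base",
            "text": "Explain geodesics.",
            "tags": ["casing"],
        },
        {
            "case_id": "geodesic-casing",
            "variant_id": "upper",
            "text": "EXPLAIN GEODESICS.",
            "tags": ["casing"],
            "comparison_to": "base",
        },
    ]
}
print(json.dumps(manifest, indent=2))
```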

For research-only generation tracing, the measurement atlas runner writes a family artifact under results/measurement_atlas/<run_id>/ with:

  • run_manifest.json
  • summary.json
  • REPORT.md
  • ledger.tsv
  • variants.jsonl
  • sequence_metrics.jsonl
  • step_metrics.jsonl
  • space_step_metrics.jsonl
  • comparisons.jsonl
  • onset_events.jsonl

The retained replay-alignment closure for the shipped 350M atlas pack lives in results/measurement_atlas/REPORT.md. The currently observed atlas surfaces are replay={hidden, embedding} and live={hidden}; run_manifest.json records requested vs observed surfaces separately, so the bundle cannot overclaim unsupported replay-space coverage. mc analyze report --bundle ... reads both standard mc analyze bundles and these atlas artifact directories, while atlas generation itself remains research-only in scripts/run_measurement_atlas.py.

Train When You Want To Act On The Measurements

poetry run mc train run --model /path/to/model --data /path/to/dataset --output /path/to/adapter

No learning rate. No rank selection. No warmup schedule. No gradient clipping. The optimizer and step sizes are derived from measured geometry rather than copied recipes.

Need extra instrumentation? Use flags on the same command path, such as --benchmark, --topo-monitor, --dim-monitor, or --entropy-reg.

What Gets Derived

| # | What Industry Tunes | What ModelCypher Derives | Source |
|---|---|---|---|
| 1 | Learning rate (1e-4) | MASS spectral ceiling | Weyl 1912, Loizou 2020 |
| 2 | Adam epsilon (1e-8) | Spectral noise floor | IEEE 754 + SVD |
| 3 | Momentum (0.9/0.999) | Cayley-Stiefel retraction | Wen & Yin 2013, Wang 2025 |
| 4 | Weight decay (0.01) | Condition ratio sigma_k / sigma_max | SVD |
| 5 | Gradient clipping (1.0) | Removed — MASS bounds by construction | Weyl 1912 |
| 6 | Warmup (5-10% steps) | Removed — geometric LR stable from step 0 | Ma & Yarats 2021 |
| 7 | LR schedule (cosine) | Removed — MASS is per-step, no schedule needed | Defazio 2024 |
| 8 | Batch size | Gradient noise scale B_crit | McCandlish 2018 |
| 9 | Early stopping (patience) | 4 geometric criteria | SVD + IEEE 754 |
| 10 | LoRA scale (alpha/rank) | Spectral bound sigma_k(W) / \|\|BA\|\| | Weyl perturbation theory |
| 11 | LoRA rank (8) | Null-space capacity tail_dims | Shannon effective rank |
| 12 | Target modules (q+v) | Spectral decay analysis | SVD per-layer |
| 13 | Dropout (0.1) | Product of two spectral ratios | Roy & Vetterli 2007 |
| 14 | Weight init (random A, zero B) | Spectral normalized to sigma_k | SVD |
| 15 | Residual scaling (1) | Per-layer sigma_max(x) / sigma_max(f(x)) | Power iteration |

Full derivations with formulas: Geometric Hyperparameter Rosetta Stone
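Two of the table's quantities are simple enough to sketch directly: the Shannon effective rank behind the LoRA-rank derivation (row 11) and the condition ratio behind weight decay (row 4). A rough sketch under those definitions, not the repo's implementation:

```python
import numpy as np

def shannon_effective_rank(singular_values: np.ndarray) -> float:
    """exp(H(p)) where p is the normalized spectrum (Roy & Vetterli 2007)."""
    p = singular_values / singular_values.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))

def condition_ratio(singular_values: np.ndarray, k: int) -> float:
    """sigma_k / sigma_max, the table's weight-decay replacement (row 4)."""
    return float(singular_values[k - 1] / singular_values[0])

s = np.linalg.svd(
    np.random.default_rng(1).standard_normal((128, 128)),
    compute_uv=False,
)
print(shannon_effective_rank(s), condition_ratio(s, k=8))
```

A flat spectrum gives an effective rank near the full dimension; a fast-decaying one gives a small effective rank, which is what makes it usable as a capacity measurement rather than a guess.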

Quick Start

git clone https://github.com/Ethyros-AI/ModelCypher.git
cd ModelCypher
poetry install          # Python 3.11+
poetry run mc --help    # Verify CLI install
# Inspect a model's per-layer geometry
poetry run mc model info /path/to/model

# Build an observation bundle from one prompt
poetry run mc analyze capture --model /path/to/model --prompt "Explain geodesics."

# Build an observation bundle from a prompt family
poetry run mc analyze family --model /path/to/model --manifest data/probes/prompt_family_minimal_pairs.json

# Re-read an existing bundle and print the shared report view
poetry run mc analyze report --bundle /path/to/bundle

# Re-read a retained measurement-atlas artifact through the same report path
poetry run mc analyze report --bundle results/measurement_atlas/<run_id>

# Layer-wise intrinsic dimension profile
poetry run mc analyze dimension-profile --model /path/to/model --samples 50

# LoRA adapter spectral analysis
poetry run mc analyze lora-svd /path/to/adapter --base /path/to/model

# Train a LoRA adapter after inspecting the model
poetry run mc train run --model /path/to/model --data /path/to/data.jsonl --output /path/to/adapter

Evidence Snapshot

| Question | What retained artifacts show | Tag |
|---|---|---|
| Does the measurement layer exist as a real workflow? | Yes. mc analyze capture, mc analyze family, and mc analyze compare now emit observation bundles with machine-readable artifacts plus a short report | [EMPIRICAL] |
| Does a canonical training path exist? | Yes. mc train run is the shipped geometry-derived runtime path guarded by pipeline_gate_v1 | [EMPIRICAL] |
| Does the retained 350M validation bundle close preservation? | No. results/pipeline_validation/verdict.json reports structural pass 5/5, inference pass 3/5, all_pass = false | [EMPIRICAL] |
| Does the retained evidence close "better than standard practice"? | No. results/nblora_vs_standard/ is retained as summary_only, and the retained single-seed LFM2-350M summary does not support superiority of nb_lora over the kept baselines | [EMPIRICAL] |
| Does the 8B bundle close efficacy? | No. results/g5_8b_validation_multiseed/multiseed_gates.json still fails cka_ok and degenerate_ok | [EMPIRICAL] |
| Is quantization promising? | Yes, as a measurement surface: results/quantization_frontier/20260227T235714Z/quantization_frontier.json shows PPL and degeneration improvement on all 3 retained models, but the frontier law is still open | [EMPIRICAL] |

Falsified Training Claims

| Hypothesis | Result | Tag |
|---|---|---|
| REINFORCE on 350M | Gradient orthogonal to CE; degradation monotonic with steps | [DISPROVEN] |
| SFT on reasoning traces | Format memorization: PPL drops, inference degrades | [DISPROVEN] |
| Pullback metric P = MM^T | P ≈ I throughout training (median deviation 0.001) | [DISPROVEN] |
| Stable rank predicts adapter rank | Pearson r = -0.51 vs tail_dims; measures different property | [DISPROVEN] |
| Constrained training (paired) | Constraints monotonically hurt | [DISPROVEN] |
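The pullback-metric entry is cheap to reproduce in spirit: compute P = M Mᵀ for a layer map and measure its median elementwise deviation from the identity. A sketch of the check only, not the retained experiment:

```python
import numpy as np

def pullback_identity_deviation(M: np.ndarray) -> float:
    """Median elementwise |M M^T - I|; values near 0 mean the
    pullback metric is effectively the identity, as reported above."""
    P = M @ M.T
    return float(np.median(np.abs(P - np.eye(P.shape[0]))))

# An orthogonal map has exactly identity pullback, so the deviation
# should sit at float-precision noise.
Q, _ = np.linalg.qr(np.random.default_rng(2).standard_normal((32, 32)))
print(pullback_identity_deviation(Q))
```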

We publish failures because intellectual honesty is not optional. Current training blockers and exit criteria live in MISSION.md and RESEARCH-ROADMAP.md.

Measurement Toolkit

mc analyze is organized around five canonical workflows:

  • capture: measure one prompt or prompt file
  • family: run explicit minimal-pair or perturbation studies
  • compare: compare two targets on the same prompt family
  • report: read an existing bundle and render the shared high-signal view
  • probe: targeted probe and red-team workflows

Expert instruments remain directly callable when you want the underlying measurements without the bundle wrapper:

  • geometry and trajectory: reasoning-flow, geodesic-profile, entropy-trajectory, chain-profile, dimension-profile, jacobian-trace
  • probe and monitoring: calibrate-safety, jailbreak-test, probe-redteam, probe-behavioral, circuit-breaker, entropy-pattern
  • diagnostics and adapter analysis: lora-svd, benchmark, knowledge-type, curriculum-profile, crm-build, crm-compare
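For a sense of what the entropy-style instruments measure at bottom: the core quantity is per-step Shannon entropy of the next-token distribution. A minimal sketch assuming you already have a (steps, vocab) array of logits; this is illustrative, not the entropy-trajectory implementation:

```python
import numpy as np

def step_entropy(logits: np.ndarray) -> np.ndarray:
    """Shannon entropy (nats) of softmax(logits) at each generation step.

    logits: array of shape (steps, vocab).
    Uses a numerically stable log-softmax via max subtraction.
    """
    z = logits - logits.max(axis=-1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    p = np.exp(logp)
    return -(p * logp).sum(axis=-1)

traj = step_entropy(np.random.default_rng(3).standard_normal((5, 100)))
print(traj)  # one value per step; collapse toward 0 means certainty
```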

Full reference: CLI-REFERENCE.md

Architecture

Hexagonal (ports-and-adapters) with strict domain boundaries:

  • Core domain (core/domain/) — pure geometry and math, zero framework imports
  • Use cases (core/use_cases/) — orchestration, cannot import from adapters
  • Adapters (adapters/) — HuggingFace Hub, filesystem, model loading
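The boundary rule can be sketched with a Python Protocol. These are hypothetical names, not the repo's actual interfaces: the use case depends only on the port, and the adapter satisfies it from outside the core.

```python
from typing import Protocol

class ModelLoaderPort(Protocol):
    """Port: all a use case is allowed to know about model loading."""
    def load_weights(self, path: str) -> dict: ...

class FilesystemModelLoader:
    """Adapter: concrete implementation living outside the core domain."""
    def load_weights(self, path: str) -> dict:
        # A real adapter would deserialize a checkpoint here.
        return {"path": path}

def inspect_geometry(loader: ModelLoaderPort, path: str) -> dict:
    """Use case: orchestrates via the port, never imports the adapter."""
    return loader.load_weights(path)

print(inspect_geometry(FilesystemModelLoader(), "/path/to/model"))
```

The payoff of the constraint is testability: the core domain can be exercised with a stub loader and no framework imports.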