
Medical Calc MCP

🏥 MCP Server with 121 validated medical calculators for AI agents. DDD architecture, evidence-based formulas with PMID citations. Supports Claude, GPT, and other LLM integrations.

Install / Use

/learn @u9401066/Medical Calc MCP

README

Medical Calculator MCP Server 🏥

A DDD-architected medical calculator service providing clinical scoring tools for AI Agent integration via MCP (Model Context Protocol).

繁體中文版 (Traditional Chinese)

Python 3.11+ MCP SDK License CI Tests References uv Code Style Architecture PRs Welcome


🎯 Features

  • 🔌 MCP Native Integration: Built with FastMCP SDK for seamless AI agent integration
  • 🔍 Intelligent Tool Discovery: Two-level key system + Tool Relation Graph (Hypergraph) for smart tool selection
  • 🛡️ Smart Parameter Matching: Alias support, fuzzy matching, and typo tolerance
  • ⚠️ Boundary Validation: Literature-backed clinical range checking with automatic warnings
  • 🏗️ Clean DDD Architecture: Onion architecture with clear separation of concerns
  • 📚 Evidence-Based: All 121 calculators cite peer-reviewed research (100% coverage, Vancouver style)
  • 🔒 Type Safe: Full Python type hints with dataclass entities
  • 🌐 Bilingual: Chinese/English documentation and tool descriptions
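As a rough illustration of the alias and fuzzy matching idea listed above, here is a minimal sketch built on the standard library's `difflib`. The alias table, the `match_param` function, and the 0.8 cutoff are assumptions made for this example; they are not the project's actual ParamMatcher.

```python
import difflib
from typing import Optional

# Hypothetical alias table: canonical parameter name -> accepted aliases.
ALIASES = {
    "serum_creatinine": ["creatinine", "cr", "scr"],
    "heart_rate": ["hr", "pulse"],
    "age": ["age_years"],
}


def _normalize(name: str) -> str:
    """Lower-case the name and unify separators so 'Heart Rate' == 'heart_rate'."""
    return name.strip().lower().replace("-", "_").replace(" ", "_")


def match_param(name: str, cutoff: float = 0.8) -> Optional[str]:
    """Resolve a user-supplied name to a canonical parameter, or None."""
    # Flatten canonical names and aliases into one lookup table.
    lookup = {}
    for canonical, aliases in ALIASES.items():
        lookup[canonical] = canonical
        for alias in aliases:
            lookup[alias] = canonical

    key = _normalize(name)
    if key in lookup:  # exact or alias hit
        return lookup[key]

    # Typo tolerance: take the single closest known name above the cutoff.
    close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[close[0]] if close else None
```

With this table, `match_param("HR")` resolves through the alias list and `match_param("creatinin")` through the fuzzy step, while an unrelated name returns `None` so the caller can ask for clarification instead of guessing.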

🤔 Why This Project?

The Problem

When AI agents (such as Claude or GPT) need to perform medical calculations, they face three challenges:

  1. Hallucination Risk: LLMs may generate incorrect formulas or values
  2. Version Confusion: Multiple versions of the same calculator exist (e.g., MELD vs MELD-Na vs MELD 3.0)
  3. No Discovery Mechanism: How does an agent know which tool to use for "cardiac risk assessment"?

The Solution

This project provides:

| Feature | Description |
|---------|-------------|
| Validated Calculators | Peer-reviewed, tested formulas |
| Tool Discovery | AI can search by specialty, condition, or clinical question |
| MCP Protocol | Standard protocol for AI-tool communication |
| Paper References | Every calculator cites original research |
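To make "search by specialty or condition" concrete, the sketch below indexes tools under coarse tags and filters on them. The `ToolEntry` and `ToolIndex` classes, the tool ids, and the tags are all hypothetical and only illustrate the lookup pattern; the project's real two-level key system may be organized differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolEntry:
    tool_id: str            # precise identity of one calculator
    specialties: frozenset  # coarse tags an agent can browse by
    conditions: frozenset


class ToolIndex:
    """Hypothetical in-memory index; not the project's ToolRegistry."""

    def __init__(self) -> None:
        self._tools = {}

    def register(self, tool_id, specialties, conditions) -> None:
        self._tools[tool_id] = ToolEntry(
            tool_id,
            frozenset(s.lower() for s in specialties),
            frozenset(c.lower() for c in conditions),
        )

    def search(self, specialty: Optional[str] = None,
               condition: Optional[str] = None) -> list:
        """Return sorted tool ids matching every filter that was given."""
        hits = []
        for entry in self._tools.values():
            if specialty is not None and specialty.lower() not in entry.specialties:
                continue
            if condition is not None and condition.lower() not in entry.conditions:
                continue
            hits.append(entry.tool_id)
        return sorted(hits)


def build_demo_index() -> ToolIndex:
    # Illustrative ids and tags only.
    index = ToolIndex()
    index.register("meld", ["hepatology"], ["end-stage liver disease"])
    index.register("meld_na", ["hepatology"], ["end-stage liver disease"])
    index.register("example_cardiac_score", ["cardiology"], ["cardiac risk assessment"])
    return index
```

An agent would first narrow by specialty or condition, then pick one tool id from the short list, which is also where version confusion (for example `meld` versus `meld_na`) becomes an explicit choice rather than a guess.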

🧪 Development Methodology

We employ a human-in-the-loop, AI-augmented workflow to ensure clinical accuracy:

  1. Domain Specification: Human experts define the target medical specialty or clinical domain.
  2. AI-Driven Search: AI agents perform comprehensive searches for the latest clinical guidelines and consensus.
  3. Guideline Extraction: Systematically identify recommended scoring systems and calculations mentioned in those guidelines.
  4. Source Validation: Trace back to original peer-reviewed primary papers to verify exact formulas and coefficients.
  5. Implementation: Develop validated calculation tools with precise parameters and evidence-based interpretations.

🔬 Research Framework

This project implements a Neuro-Symbolic Framework for reliable medical calculation, combining LLM understanding with validated symbolic computation.

Academic Positioning

| Challenge | Traditional LLM | Our Solution |
| --------- | --------------- | ------------ |
| Calculation Accuracy | ~50% (MedCalc-Bench) | >95% via validated formulas |
| Parameter Extraction | Vocabulary mismatch | ParamMatcher (60+ aliases) |
| Safety Guardrails | No clinical constraints | BoundaryValidator (PMID-backed) |
| Tool Discovery | Keyword/RAG only | Two-Level Key + Hypergraph |

Three-Module Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                     NEURO-SYMBOLIC MEDICAL REASONING                        │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌───────────────────┐   ┌────────────────────┐   ┌───────────────────┐     │
│  │  Discovery Engine │ → │ Reasoning Interface│ → │    Safety Layer   │     │
│  │  (Tool Selection) │   │  (Param Matching)  │   │  (Validation)     │     │
│  │                   │   │                    │   │                   │     │
│  │  • High/Low Keys  │   │  • Alias Matching  │   │  • Range Check    │     │
│  │  • Hypergraph     │   │  • Fuzzy Match     │   │  • PMID Citation  │     │
│  │  • Context-Aware  │   │  • Multi-lingual   │   │  • Error Messages │     │
│  └───────────────────┘   └────────────────────┘   └───────────────────┘     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Core Contributions

  1. Semantic Parameter Mapping (ParamMatcher): Resolves vocabulary mismatch between clinical text and calculator parameters through alias tables, fuzzy matching, and suffix normalization.

  2. Literature-Based Guardrails (BoundaryValidator): Checks input values against bounds derived from peer-reviewed literature and flags clinically impossible values (17+ parameters with PMID citations).

  3. Context-Aware Tool Discovery: Two-level key system + Clinical Knowledge Graph for intelligent tool recommendation based on clinical context.
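The guardrail idea in the second contribution can be sketched as a two-tier range check: values outside a hard range are rejected, and values outside a narrower range only raise a warning. The `Boundary` fields, the `validate` function, the two-tier split, and especially the numbers below are placeholders invented for this example; the project's BoundaryValidator takes its limits from cited literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    hard_min: float  # below this the value is treated as impossible
    hard_max: float  # above this the value is treated as impossible
    warn_min: float  # outside [warn_min, warn_max] is unusual but accepted
    warn_max: float
    unit: str
    source: str      # where the limits come from


# PLACEHOLDER limits and citation, for illustration only (not clinical guidance).
BOUNDARIES = {
    "heart_rate": Boundary(0, 300, 40, 180, "bpm", "PMID: <placeholder>"),
}


def validate(name: str, value: float) -> tuple:
    """Return (status, message) where status is 'ok', 'warning', or 'error'."""
    b = BOUNDARIES.get(name)
    if b is None:
        return ("ok", f"no boundary defined for {name}")
    if not (b.hard_min <= value <= b.hard_max):
        return ("error",
                f"{name}={value} {b.unit} is outside {b.hard_min} to {b.hard_max} ({b.source})")
    if not (b.warn_min <= value <= b.warn_max):
        return ("warning",
                f"{name}={value} {b.unit} is unusual; expected {b.warn_min} to {b.warn_max} ({b.source})")
    return ("ok", f"{name}={value} {b.unit} is within the expected range")
```

Returning a status plus a message that carries the source lets the calling agent surface the warning to the user rather than silently computing a score from an implausible input.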

🏆 Levels of Academic Value

| Level | Contribution | Scholarly Focus |
| ----- | ------------ | --------------- |
| L1 | Validated Symbolic Engine | Extends LLM with deterministic precision |
| L2 | Hierarchical Tool Discovery | Solves RAG precision in high-stakes domains |
| L3 | Robust Semantic Extraction | Resolves the "Vocabulary Mismatch" problem |
| L4 | Knowledge-Gated Safety Layer | Unique: Literature-derived constraint verification |
| L5 | Clinical Hypergraph Agent | Cross-specialty workflow reasoning |

📄 For the detailed research roadmap and benchmark strategy, see ROADMAP.md


🏗️ Architecture

┌──────────────────────────────────────────────────────────────┐
│                    infrastructure/mcp/                       │
│                (MCP Server, Handlers, Resources)             │
│  ┌──────────────────────────────────────────────────────┐    │
│  │  MedicalCalculatorServer                             │    │
│  │  ├── handlers/DiscoveryHandler (discover, list...)   │    │
│  │  ├── handlers/CalculatorHandler (calculate_*)        │    │
│  │  └── resources/CalculatorResourceHandler             │    │
│  └──────────────────────────────────────────────────────┘    │
└──────────────────────────┬───────────────────────────────────┘
                           │ uses
                           ▼
┌──────────────────────────────────────────────────────────────┐
│                     application/                             │
│               (Use Cases, DTOs, Validation)                  │
│  ┌──────────────────────────────────────────────────────┐    │
│  │  DiscoveryUseCase, CalculateUseCase                  │    │
│  │  DiscoveryRequest/Response, CalculateRequest/Response│    │
│  └──────────────────────────────────────────────────────┘    │
└──────────────────────────┬───────────────────────────────────┘
                           │ depends on
                           ▼
┌──────────────────────────────────────────────────────────────┐
│                       domain/                                │
│            (Entities, Services, Value Objects)               │
│  ┌──────────────────────────────────────────────────────┐    │
│  │  BaseCalculator, ToolMetadata, ScoreResult           │    │
│  │  LowLevelKey, HighLevelKey, ToolRegistry             │    │
│  └──────────────────────────────────────────────────────┘    │
│                    【Core, Zero Dependencies】                │
└──────────────────────────────────────────────────────────────┘
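The dependency direction in the diagram can be sketched in a single file, with each layer importing only from the layer beneath it. Class names such as `BaseCalculator`, `ScoreResult`, `CalculateRequest`, and `CalculateUseCase` are taken from the diagram, but their fields and methods here are assumptions. `BmiCalculator` and `handle_calculate` are hypothetical stand-ins, and the real server registers its handlers through the MCP SDK rather than a plain function.

```python
from dataclasses import dataclass
from typing import Protocol


# --- domain/: pure logic, no imports from outer layers ---
@dataclass(frozen=True)
class ScoreResult:
    value: float
    unit: str


class BaseCalculator(Protocol):
    tool_id: str

    def calculate(self, **params: float) -> ScoreResult: ...


class BmiCalculator:
    """Illustrative calculator: body mass index = weight (kg) / height (m)^2."""
    tool_id = "bmi"

    def calculate(self, **params: float) -> ScoreResult:
        weight_kg, height_m = params["weight_kg"], params["height_m"]
        if height_m <= 0:
            raise ValueError("height_m must be positive")
        return ScoreResult(round(weight_kg / height_m ** 2, 1), "kg/m^2")


# --- application/: use cases and DTOs, depends only on domain ---
@dataclass(frozen=True)
class CalculateRequest:
    tool_id: str
    params: dict


@dataclass(frozen=True)
class CalculateResponse:
    tool_id: str
    result: ScoreResult


class CalculateUseCase:
    def __init__(self, calculators: list) -> None:
        self._by_id = {c.tool_id: c for c in calculators}

    def execute(self, request: CalculateRequest) -> CalculateResponse:
        calculator = self._by_id[request.tool_id]  # KeyError if unknown
        return CalculateResponse(request.tool_id, calculator.calculate(**request.params))


# --- infrastructure/: transport adapter, depends on application ---
def handle_calculate(use_case: CalculateUseCase, payload: dict) -> dict:
    """Stand-in for an MCP tool handler: dict in, dict out."""
    try:
        request = CalculateRequest(payload["tool_id"], payload.get("params", {}))
        response = use_case.execute(request)
    except KeyError as exc:
        return {"error": f"missing or unknown key: {exc}"}
    return {"tool_id": response.tool_id,
            "value": response.result.value,
            "unit": response.result.unit}
```

Because the domain classes import nothing from the other layers, a calculator like this can be unit-tested without starting a server, which is the practical payoff of keeping the core free of dependencies.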

Key Design Decisions

| Decision | Rationale |
|----------|-----------|
| DDD Onion | Domain logic isolated from infrastructure |
| FastMCP | Native Python MCP SDK, simple decorator-b
