ThyroidEarlyDetection
Early detection ML models for hyperthyroid data
Install / Use
/learn @ianrowan/ThyroidEarlyDetectionREADME
Early Detection of Hyperthyroid Episodes from Wearable Data
A machine learning system that detects hyperthyroid episodes 3-4 weeks before lab confirmation using Apple Watch and Whoop data. Analyzes resting heart rate deviation patterns to provide early warning, enabling proactive medication adjustment between blood tests.
If you are interested in the accompanied app the beta sign up is here with repo to follow
Overview
Adjusting thyroid medication between blood tests is challenging because symptoms often lag behind physiological changes. This project demonstrates that wearable sensor data contains detectable signals of hyperthyroid onset weeks before labs confirm it.
Key Results
- 3-4 week early warning before labeled hyperthyroid onset
- 97% recall on confirmed hyperthyroid episodes (with SMA-4 smoothing)
- Only 3 features needed: RHR deviation from 14-day baseline, 30-day baseline, and delta
- Model flags transition windows that later prove to be episode onset
How It Works
The model outputs a continuous risk score [0, 1] every 5 days. When resting heart rate begins deviating from personal baselines, the risk score rises - often weeks before symptoms or labs would indicate a problem.
Aug 03: Risk 0.53 <- Model alerts (labeled "normal")
Aug 08: Risk 0.73 <- Model alerts (labeled "normal")
Aug 13: Risk 0.57 <- Model alerts (labeled "normal")
Aug 30: Risk 0.44 <- Labeled hyper onset confirmed by labs
The "false positives" in early August were actually correct early detections.
Installation
git clone https://github.com/ianrowan/thyroid-ml.git
cd thyroid-ml
python -m venv venv
source venv/bin/activate # or `venv\Scripts\activate` on Windows
pip install -r requirements.txt
Requirements
- Python 3.10+
- Apple Health export (XML format)
- Historical labels for training (date ranges with severity)
Usage
Data Preparation
- Export Apple Health data from iPhone: Health app > Profile > Export All Health Data
- Extract the export to
data/apple_health_export/
# Parse Apple Health export
python src/parse_health_export.py
# Extract 5-day window features
python src/feature_extraction.py
# Generate visualization for labeling
python src/visualize_for_labeling.py
Training
Requires data/labels.csv with columns: start_date, end_date, state, confidence
# Train production models
python -m src.save_models
# Or train with experiment tracking
python -m src.train --model xgboost
# View experiment results
mlflow ui
Inference
# Run inference on Apple Health export
python -m src.infer --input data/apple_health_export/export.xml
# Show predictions from a specific date
python -m src.infer --input export.xml --since 2025-12-01
# Show more history in trajectory
python -m src.infer --input export.xml --windows 10
Architecture
Data Pipeline
Apple Health XML (2.8GB)
|
v parse_health_export.py (streaming, ~2min)
Parquet files per signal type
|
v feature_extraction.py
5-day window features (63 features)
|
v + labels.csv
Training data (temporal split)
|
v train.py / save_models.py
Production models
Feature Engineering
For each 5-day window, the system computes:
- Central tendency: mean, median
- Variability: std, IQR, coefficient of variation
- Extremes: min, max, 5th/95th percentiles
- Trends: linear regression slope, delta from prior window
- RHR-specific: deviation from 14-day and 30-day baselines
- Sleep: total time, efficiency, stage breakdown
Model Architecture
EarlyDetectionModel (XGBoost binary classifier)
- Trained on 3 features:
rhr_deviation_14d,rhr_deviation_30d,rhr_delta - Outputs continuous probability [0, 1]
- Threshold 0.35 provides 3-week early warning (tunable for sensitivity)
- SMA-4 smoothing reduces isolated false positives by 24%
Key Insight
The model detects when resting heart rate begins deviating from personal baselines. This deviation precedes other symptoms and lab changes by weeks, making it the primary signal for early detection.
Limitations
- Single subject: Results are from one individual's data; generalization to others not validated
- Device transitions: May need recalibration when switching wearables (Apple Watch to Whoop, etc.)
- Baseline dependency: Requires stable "normal" periods to establish personal baselines
- Not a diagnostic tool: Intended to prompt earlier lab testing, not replace medical evaluation
iOS App
The ios/ directory contains a SwiftUI app that runs the trained model on-device using CoreML. It reads resting heart rate from HealthKit and displays a daily risk score with trend charts.
To use the app with your own model:
# Train your model
python -m src.save_models
# Convert to CoreML
pip install coremltools
python convert_to_coreml.py
# Copy into the iOS app
cp ThyroidEarlyDetection.mlmodel ios/ThyroidDetect/ThyroidDetect/ThyroidEarlyDetection.mlmodel
Then open ios/ThyroidDetect/ThyroidDetect.xcodeproj in Xcode, set your signing team, and build to your iPhone. See ios/README.md for full setup instructions.
Repository Structure
thyroid-ml/
├── src/
│ ├── parse_health_export.py # Streaming XML parser
│ ├── feature_extraction.py # Window feature aggregation
│ ├── dataset.py # Label loading, temporal splits
│ ├── models.py # RandomForest, XGBoost, Semi-supervised
│ ├── sequence_models.py # LSTM/GRU (experimental)
│ ├── train.py # Training with MLflow tracking
│ ├── save_models.py # Production model export
│ └── infer.py # CLI inference
├── ios/ # iOS app (SwiftUI + CoreML)
│ ├── ThyroidDetect/ # Xcode project
│ ├── docs/ # iOS-specific documentation
│ └── README.md # iOS setup & model loading guide
├── convert_to_coreml.py # Export model to CoreML format
├── data/ # Health data & labels (gitignored)
├── models/ # Saved model artifacts (gitignored)
├── mlruns/ # MLflow experiments (gitignored)
├── research.md # Detailed research documentation
└── requirements.txt
Research Documentation
See research.md for:
- Complete experiment results and hyperparameter tuning
- Signal analysis explaining detection limits
- Smoothing experiment results
- Model architecture rationale
License
MIT License - See LICENSE file for details.
Disclaimer
This software is for research purposes only and is not intended for medical diagnosis or treatment decisions. Always consult with healthcare providers for thyroid management.
Citation
If you use this work in your research, please cite:
@software{thyroid_ml,
title = {Early Detection of Hyperthyroid Episodes from Wearable Data},
year = {2026},
url = {https://github.com/ianrowan/thyroid-ml}
}
Contributing
Contributions are welcome. Please open an issue to discuss proposed changes before submitting a pull request.
Related Skills
node-connect
351.8kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
110.9kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
351.8kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
351.8kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
