BabyDoctor

The AI Radiologist You Can Chat With

Welcome to BabyDoctor, your personal "Ultrasound Radiologist in a Box"! Let's face it, most of us try to avoid seeing the doctor as much as we can, especially when it involves cryptic ultrasound scans. BabyDoctor is here to bridge the gap and demystify medical jargon for you.

BabyDoctor uses LLaVA (Large Language and Vision Assistant) to generate ultrasound analyses. LLaVA combines Meta's Llama 2 text generator with OpenAI's CLIP for image embedding. The model was fine-tuned for ultrasound scans on a dataset of 65,000 text-image pairs, using a 4-bit quantised LoRA on a Lambda Labs A10 GPU for 8 hours.
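To give a rough sense of why 4-bit quantisation and LoRA make fine-tuning feasible on a single GPU, the arithmetic can be sketched as below. The README does not state the size of the base model, the hidden dimension, or the LoRA rank, so the figures used here (7 billion parameters, hidden size 4096, rank 64) are illustrative assumptions only, and the function names are hypothetical.

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters when a d_out x d_in weight update is
    factored as B (d_out x rank) @ A (rank x d_in), as LoRA does."""
    return rank * (d_in + d_out)


# Assumed values for illustration; none of them come from the README.
NUM_PARAMS = 7_000_000_000
HIDDEN = 4096
RANK = 64

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
int4_gib = weight_memory_gib(NUM_PARAMS, 4)
full_matrix = HIDDEN * HIDDEN
adapter = lora_trainable_params(HIDDEN, HIDDEN, RANK)

print(f"16-bit weights: {fp16_gib:.1f} GiB, 4-bit weights: {int4_gib:.1f} GiB")
print(f"one {HIDDEN}x{HIDDEN} projection: {full_matrix} weights, "
      f"LoRA adapter: {adapter} trainable weights")
```

Under these assumed numbers, 4-bit weights take a quarter of the memory of 16-bit weights, and the adapter for one square projection has 32 times fewer trainable parameters than the full matrix. Activations, optimizer state, and the vision encoder are not counted, so actual memory use is higher.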

🚼 Reproduce

To reproduce the results with BabyDoctor, follow these steps on a system with at least 16 vCPUs, 32 GB of RAM, and an NVIDIA GPU with more than 12 GB of VRAM:

Clone the repository: git clone https://github.com/<username>/babydoctor.git
Install CUDA following the official NVIDIA setup instructions.
Install Conda.
Run mkdir -p ~/git; cd ~/git.
Clone this repository into ~/git/BabyDoctor.
Run conda env create -f BabyDoctor/llmenv.yaml. This will take a while.
Run conda activate llmforbio. From this step onward, execute all commands under this environment.
Run MAX_JOBS=8 python3 -m pip install flash-attn.
Download the dataset: git clone https://github.com/razorx89/roco-dataset; pushd roco-dataset; python3 scripts/fetch.py; popd.
Prepare training data: python3 BabyDoctor/scripts/massage_data.py.
Start fine-tuning: mv BabyDoctor/finetune.sh .; bash finetune.sh. This took 8 hours on the A10.
Modify and run ./BabyDoctor/scripts/inference.sh to prompt it!
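The data and training steps above can also be chained from a small driver script. The following is a minimal sketch, not part of the repository: it assumes CUDA, Conda, and flash-attn are already installed, that the llmforbio environment is active, and that ~/git is the workspace as in the steps above. The names STEPS and run_steps are hypothetical. It defaults to a dry run that only prints each command.

```python
import subprocess
from pathlib import Path

# (description, command, working directory relative to the workspace).
# The commands are the ones listed in the steps above.
STEPS = [
    ("Clone the ROCO dataset",
     ["git", "clone", "https://github.com/razorx89/roco-dataset"], "."),
    ("Fetch the ROCO images", ["python3", "scripts/fetch.py"], "roco-dataset"),
    ("Prepare training data",
     ["python3", "BabyDoctor/scripts/massage_data.py"], "."),
    ("Move the fine-tuning script", ["mv", "BabyDoctor/finetune.sh", "."], "."),
    ("Fine-tune", ["bash", "finetune.sh"], "."),
]


def run_steps(workspace: Path, dry_run: bool = True) -> list[str]:
    """Print each step and, unless dry_run is set, execute it.

    Execution stops at the first failing command because check=True
    makes subprocess.run raise CalledProcessError.
    """
    handled = []
    for description, argv, subdir in STEPS:
        line = f"[{subdir}] {' '.join(argv)}"
        print(f"{description}: {line}")
        if not dry_run:
            subprocess.run(argv, cwd=workspace / subdir, check=True)
        handled.append(line)
    return handled


if __name__ == "__main__":
    # Switch dry_run to False only after reviewing the printed commands.
    run_steps(Path.home() / "git", dry_run=True)
```

Passing the working directory through cwd avoids relying on shell state such as cd or pushd between steps.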

A web UI is also available; see the instructions in the BabyDoctor repository.
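Before committing to a multi-hour run, the hardware requirements listed at the start of this section can be checked with a short script. This is a sketch under assumptions: the function names are hypothetical, the RAM probe relies on os.sysconf (available on Linux), and the VRAM probe relies on nvidia-smi being on the PATH. Any value that cannot be measured is reported rather than guessed.

```python
import os
import shutil
import subprocess

# Thresholds taken from the requirements stated in this section.
MIN_VCPUS = 16
MIN_RAM_GIB = 32
MIN_VRAM_GIB = 12  # the requirement is strictly more than this


def find_shortfalls(vcpus, ram_gib, vram_gib):
    """Return a list of problems; an empty list means every check passed.

    None means the quantity could not be measured and counts as a problem.
    """
    problems = []
    if vcpus is None or vcpus < MIN_VCPUS:
        problems.append(f"vCPUs: found {vcpus}, need at least {MIN_VCPUS}")
    if ram_gib is None or ram_gib < MIN_RAM_GIB:
        problems.append(f"RAM (GiB): found {ram_gib}, need at least {MIN_RAM_GIB}")
    if vram_gib is None or vram_gib <= MIN_VRAM_GIB:
        problems.append(f"VRAM (GiB): found {vram_gib}, need more than {MIN_VRAM_GIB}")
    return problems


def total_ram_gib():
    """Total physical memory in GiB, or None where os.sysconf cannot report it."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**30
    except (AttributeError, ValueError, OSError):
        return None


def max_vram_gib():
    """Largest per-GPU memory in GiB as reported by nvidia-smi, or None."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        # With nounits, each line is one GPU's total memory in MiB.
        return max(int(value) for value in out.split()) / 1024
    except (subprocess.CalledProcessError, OSError, ValueError):
        return None


if __name__ == "__main__":
    for problem in find_shortfalls(os.cpu_count(), total_ram_gib(), max_vram_gib()):
        print("WARNING:", problem)
```

Reported totals are usually slightly below the nominal size of the hardware, so a machine sold as 32 GB may show a little under 32 GiB. Treat the warnings as a guide, not a hard gate.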

🧪 Curious?

Try running BabyDoctor on your own ultrasound scans or experiment with different prompts. Let's see how well BabyDoctor can bridge the language gap between medical jargon and everyday English for you. You might be surprised!

And, of course, contributions to improve BabyDoctor are always welcome.

🤝 Contributing

We welcome contributions to BabyDoctor! If you have a feature request, bug report, or proposal, please submit an issue. If you wish to contribute code, please fork this repository and submit a pull request.

📜 License

BabyDoctor is subject to the licenses of Meta's Llama 2, OpenAI's CLIP, OpenAI's GPT-4 User License Agreement, and LLaVA. Our data, code, and checkpoints are intended and licensed for research use only.

Attribution is appreciated but not necessary:

@misc{photomz2023,
  author = {Markus Zhang and Vir Chau},
  title = {BabyDoctor},
  year = {2023},
  howpublished = {\url{https://github.com/photomz/BabyDoctor}},
  note = {GitHub}
}
