FEMR
Framework for Electronic Medical Records
FEMR is a Python package for manipulating longitudinal EHR data for machine learning, with a focus on supporting the creation of foundation models and verifying their presumed benefits in healthcare.
The currently supported foundation model is MOTOR.
(Users who want to train auto-regressive CLMBR-style models should use FEMR 0.1.16 or https://github.com/som-shahlab/hf_ehr)
FEMR works with data that has been converted to the MEDS schema, a simple schema that supports a wide variety of EHR / claims datasets. Please see the MEDS documentation, and in particular its provided ETLs for help converting your data to MEDS.
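To make the MEDS schema concrete, here is a minimal sketch of a MEDS-style event table built with pandas. The column names (subject_id, time, code, numeric_value) follow the core MEDS event schema; the specific codes and values below are invented example data, not taken from any real dataset.

```python
import pandas as pd

# Each row is one coded event for one subject at one point in time
# (hypothetical example data).
events = pd.DataFrame(
    {
        "subject_id": [1, 1, 2],
        "time": pd.to_datetime(["2020-01-01", "2020-01-03", "2021-06-15"]),
        "code": ["ICD10CM/E11.9", "LOINC/4548-4", "ICD10CM/I10"],
        "numeric_value": [None, 7.2, None],  # set only for lab-style events
    }
)

# MEDS data is stored as Parquet; sorting by subject and time lets you
# reconstruct a per-subject timeline with a simple filter or groupby.
events = events.sort_values(["subject_id", "time"])
timeline = events[events.subject_id == 1]
print(len(timeline))  # → 2
```

In a real MEDS dataset these events would live in Parquet shards rather than an in-memory DataFrame, but the row shape is the same.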
FEMR helps users:
- Use ontologies to better understand / featurize medical codes
- Algorithmically label subject records based on structured data
- Generate tabular features from subject timelines for use with traditional gradient boosted tree models
- Train and finetune MOTOR-derived models for binary classification and prediction tasks
We recommend users start with our tutorial folder.
Installation
pip install femr
# If you are using deep learning, you also need to install xformers
#
# Note that xformers has some known issues with MacOS.
# If you are using MacOS you might also need to install llvm. See https://stackoverflow.com/questions/60005176/how-to-deal-with-clang-error-unsupported-option-fopenmp-on-travis
pip install xformers
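After installing, it can be useful to confirm that the optional deep-learning dependencies are importable before running anything heavy. This is just a sketch of such a check; the helper function below is not part of FEMR:

```python
import importlib.util

def check_installed(names):
    """Return a dict mapping each package name to whether it can be imported."""
    return {n: importlib.util.find_spec(n) is not None for n in names}

# "femr" is always required; "xformers" only if you use deep learning.
status = check_installed(["femr", "xformers"])
for name, ok in status.items():
    print(f"{name}: {'available' if ok else 'NOT installed'}")
```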
Getting Started
The first step of using FEMR is to convert your subject data into MEDS, the standard input format expected by the FEMR codebase.
Note: FEMR currently only supports MEDS v3, so you will need to install MEDS v3 versions of packages, e.g. pip install meds-etl==0.3.11
The best way to do this is with the ETLs provided by MEDS.
OMOP Data
If you have OMOP CDM formatted data, follow these instructions:
- Download your OMOP dataset to [PATH_TO_SOURCE_OMOP].
- Convert OMOP => MEDS using the following:
# Convert OMOP => MEDS data format
meds_etl_omop [PATH_TO_SOURCE_OMOP] [PATH_TO_OUTPUT_MEDS]
Stanford STARR-OMOP Data
If you are using the STARR-OMOP dataset from Stanford (which uses the OMOP CDM), we add an initial Stanford-specific preprocessing step. Otherwise this should be identical to the OMOP Data section. Follow these instructions:
- Download your STARR-OMOP dataset to [PATH_TO_SOURCE_OMOP].
- Convert STARR-OMOP => MEDS using the following:
# Convert OMOP => MEDS data format
meds_etl_omop [PATH_TO_SOURCE_OMOP] [PATH_TO_OUTPUT_MEDS]_raw
# Apply Stanford fixes
femr_stanford_omop_fixer [PATH_TO_OUTPUT_MEDS]_raw [PATH_TO_OUTPUT_MEDS]
Development
The following guides are for developers who want to contribute to FEMR.
Precommit checks
Before committing, please run the following commands to ensure that your code is formatted correctly and passes all tests.
Installation
conda install pre-commit pytest -y
pre-commit install
Running
Test Functions
pytest tests
Formatting Checks
pre-commit run --all-files