
Harmony

The Harmony Python library: a research tool for psychologists to harmonise data and questionnaire items. Open source.



<a href="https://harmonydata.ac.uk"><span align="left">🌐 harmonydata.ac.uk</span></a> <a href="https://www.linkedin.com/company/harmonydata"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/linkedin.svg" alt="Harmony | LinkedIn" width="21px"/></a> <a href="https://twitter.com/harmony_data"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/x.svg" alt="Harmony | X" width="21px"/></a> <a href="https://www.instagram.com/harmonydata/"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/instagram.svg" alt="Harmony | Instagram" width="21px"/></a> <a href="https://www.facebook.com/people/Harmony-Project/100086772661697/"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/fb.svg" alt="Harmony | Facebook" width="21px"/></a> <a href="https://www.youtube.com/channel/UCraLlfBr0jXwap41oQ763OQ"><img align="left" src="https://raw.githubusercontent.com//harmonydata/.github/main/profile/yt.svg" alt="Harmony | YouTube" width="21px"/></a>


Harmony Python library

<!-- badges: start -->


You can also join our Discord server! If you found Harmony helpful, you can leave us a review!

What does Harmony do?

  • Psychologists and social scientists often have to match items in different questionnaires, such as "I often feel anxious" and "Feeling nervous, anxious or afraid".
  • This is called harmonisation.
  • Harmonisation is a time-consuming and subjective process.
  • Going through long PDFs of questionnaires and putting the questions into Excel is no fun.
  • Enter Harmony, a tool that uses natural language processing and generative AI models to help researchers harmonise questionnaire items, even in different languages.

Quick start with the code

Read our guide to contributing to Harmony here or read CONTRIBUTING.md.

You can run the walkthrough Python notebook in Google Colab with a single click: <a href="https://colab.research.google.com/github/harmonydata/harmony/blob/main/Harmony_example_walkthrough.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

You can also download an R markdown notebook to run in R Studio: <a href="https://harmonydata.ac.uk/harmony_r_example.nb.html" target="_parent"><img src="https://img.shields.io/badge/RStudio-4285F4" alt="Open In R Studio"/></a>

You can run the walkthrough R notebook in Google Colab with a single click: <a href="https://colab.research.google.com/github/harmonydata/experiments/blob/main/Harmony_R_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

View the PDF documentation of the R package on CRAN.

Looking for examples?

Check out our examples repository at https://github.com/harmonydata/harmony_examples

<!-- badges: end -->

The Harmony Project

Harmony is a tool using AI which allows you to compare items from questionnaires and identify similar content. You can try Harmony at https://harmonydata.ac.uk/app and you can read our blog at https://harmonydata.ac.uk/blog/.

Who to contact?

You can contact the Harmony team at https://harmonydata.ac.uk/, or Thomas Wood at https://fastdatascience.com/.

🖥 Installation instructions (video)

Installing Harmony

🖱 Looking to try Harmony in the browser?

Visit: https://harmonydata.ac.uk/app/

You can also visit our blog at https://harmonydata.ac.uk/blog/

✅ You need Tika if you want to extract instruments from PDFs

Download and install Java if you don't have it already. Then download Apache Tika from https://tika.apache.org/download.html and run it on your computer:

java -jar tika-server-standard-2.3.0.jar
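Before loading PDFs it can help to confirm that the Tika server is actually answering. The sketch below is not part of Harmony: `tika_is_running` is a hypothetical helper, and the address assumes Tika's default port 9998 on your own machine. Adjust the URL if you started the server differently.

```python
import urllib.request

# Assumption: Tika server's default address; change it if you use another host or port.
TIKA_URL = "http://localhost:9998/tika"


def tika_is_running(url=TIKA_URL, timeout=2.0):
    """Return True if an HTTP server answers at `url` with status 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts and HTTP errors
        # (urllib's URLError and HTTPError are subclasses of OSError).
        return False


if __name__ == "__main__":
    print("Tika reachable:", tika_is_running())
```

If this prints `False`, check that the `java -jar` command is still running in another terminal.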

Requirements

You need a Windows, Linux or Mac system with

  • Python 3.8 or above
  • the requirements in requirements.txt
  • Java (if you want to extract items from PDFs)
  • Apache Tika (if you want to extract items from PDFs)
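The first and third requirements can be checked from Python itself. This is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper name, and it only looks for a `java` executable on your PATH rather than verifying a particular Java version.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 8)):
    """Report whether the interpreter and Java look usable for Harmony."""
    return {
        # True when the running interpreter is at least Python 3.8
        "python_ok": sys.version_info >= min_python,
        # True when a `java` executable is on PATH (only needed for PDF extraction)
        "java_found": shutil.which("java") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {ok}")
```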

🖥 Installing Harmony Python package

You can install from PyPI.

pip install harmonydata

Loading all models

Harmony uses spaCy to help with text extraction from PDFs. spaCy models can be downloaded with the following command in Python:

import harmony
harmony.download_models()

Matching example instruments

import harmony

# Pick two of the bundled example instruments and match their items
instruments = [harmony.example_instruments["CES_D English"], harmony.example_instruments["GAD-7 Portuguese"]]
match_response = harmony.match_instruments(instruments)

questions = match_response.questions
similarity = match_response.similarity_with_polarity
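To read the result, you will usually want the strongest pairs rather than the whole matrix. The sketch below runs on invented toy data, not on Harmony output. It assumes the similarity matrix is square and indexed in the same order as the list of questions, and that a negative score marks items with opposite polarity. `top_pairs` is a hypothetical helper, not a Harmony function.

```python
def top_pairs(texts, similarity, k=3):
    """Return the k strongest (score, text_i, text_j) pairs from a square matrix.

    Assumes similarity[i][j] is the score between texts[i] and texts[j].
    """
    pairs = [
        (similarity[i][j], texts[i], texts[j])
        for i in range(len(texts))
        for j in range(i + 1, len(texts))  # upper triangle only, skips self-matches
    ]
    # Rank by absolute value so strong opposite-polarity matches are kept too
    pairs.sort(key=lambda p: abs(p[0]), reverse=True)
    return pairs[:k]


# Toy data invented for illustration
texts = ["I often feel anxious", "Feeling nervous, anxious or afraid", "I feel calm"]
sim = [
    [1.0, 0.8, -0.6],
    [0.8, 1.0, -0.5],
    [-0.6, -0.5, 1.0],
]

for score, first, second in top_pairs(texts, sim, k=2):
    print(f"{score:+.2f}  {first!r} <-> {second!r}")
```

With real results, the texts would presumably come from the `question_text` of each entry in `match_response.questions`, and the matrix from `match_response.similarity_with_polarity`. Indexing with `similarity[i][j]` works for nested lists and NumPy arrays alike.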

How to load a PDF, Excel or Word file into an instrument

harmony.load_instruments_from_local_file("gad-7.pdf")

Optional environment variables

As an alternative to downloading models, you can set environment variables so that Harmony calls spaCy on a remote server. This is only necessary if you are making a server deployment of Harmony.

  • HARMONY_DATA_PATH - determines where data files are stored. Defaults to HOME DIRECTORY/harmony
  • HARMONY_NO_PARSING - set to 1 to import a lightweight variant of Harmony which doesn't support PDF parsing.
  • HARMONY_NO_MATCHING - set to 1 to import a lightweight variant of Harmony which doesn't support matching.
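The documented defaults can be mirrored in a few lines of Python. This is only a sketch of the behaviour described in the list above, not Harmony's internal code; `resolve_data_path` and `flag_enabled` are hypothetical helper names.

```python
import os
from pathlib import Path


def resolve_data_path(env=os.environ):
    """Return HARMONY_DATA_PATH if set, otherwise <home directory>/harmony."""
    return Path(env.get("HARMONY_DATA_PATH", Path.home() / "harmony"))


def flag_enabled(name, env=os.environ):
    """Treat a variable as enabled only when it is set to the string "1"."""
    return env.get(name) == "1"


if __name__ == "__main__":
    print("Data path:", resolve_data_path())
    for flag in ("HARMONY_NO_PARSING", "HARMONY_NO_MATCHING"):
        print(flag, "enabled:", flag_enabled(flag))
```

The wording "set to 1 to import a lightweight variant" suggests the two flags are read when `harmony` is imported, so if you set them from Python with `os.environ[...] = "1"`, doing so before `import harmony` is the safer order.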

Creating instruments from a list of strings

You can also create instruments quickly from a list of strings:

from harmony import create_instrument_from_list, match_instruments
instrument1 = create_instrument_from_list(["I feel anxious", "I feel nervous"])
instrument2 = create_instrument_from_list(["I feel afraid", "I feel worried"])

match_response = match_instruments([instrument1, instrument2])

Loading instruments from PDFs

If you have a local file, you can load it into a list of Instrument instances:

from harmony import load_instruments_from_local_file
instruments = load_instruments_from_local_file("gad-7.pdf")

📋 Importing from Google Forms

Harmony can import questionnaires directly from Google Forms URLs, allowing you to harmonise survey instruments that are hosted on Google Forms.

Setup

To use Google Forms integration, you need a Google API key:

  1. Visit the Google Cloud Console
  2. Create a new project or select an existing one
  3. Enable the Google Forms API for your project
  4. Create credentials (API key) for the Google Forms API
  5. Set the API key as an environment variable:
export GOOGLE_FORMS_API_KEY="your-api-key-here"
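A quick way to confirm that step 5 worked is to read the variable back from Python before you try to import a form. The helper below is a hypothetical convenience, not a Harmony function; it only checks that the variable is present and non-empty, not that the key is valid.

```python
import os


def get_google_forms_api_key(env=os.environ):
    """Return the key from GOOGLE_FORMS_API_KEY, or raise with a clear message."""
    key = env.get("GOOGLE_FORMS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GOOGLE_FORMS_API_KEY is not set. Export it in the shell that runs Python."
        )
    return key


if __name__ == "__main__":
    try:
        get_google_forms_api_key()
        print("GOOGLE_FORMS_API_KEY is set")
    except RuntimeError as error:
        print(error)
```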

Usage

Import questionnaires from Google Forms using the URL or form ID:

from harmony import convert_files_to_instruments
from harmony.schemas.requests.text import RawFile
from harmony.schemas.enums.file_types import FileType

# Create a RawFile with the Google Forms URL
file = RawFile(
    file_name="Customer Satisfaction Survey",
    file_type=FileType.google_forms,
    content="https://docs.google.com/forms/d/e/1FAIpQLSc.../viewform"
)

# Convert to Harmony instruments
instruments = convert_files_to_instruments([file])

# Access the questions
for instrument in instruments:
    print(f"Form: {instrument.instrument_name}")
    for question in instrument.questions:
        print(f"{question.question_no}. {question.question_text}")
        if question.options:
            print(f"   Options: {', '.join(question.options)}")

You can also use the form ID directly instead of the full URL:

file = RawFile(
    file_name="Survey",
    file_type=FileType.google_forms,
    content="1FAIpQLSc_form_id_here"
)

Supported Question Type
