Labelpandas
Labelbox Connector for Pandas
[!WARNING] Starting in July 2024, we will begin archiving all data connector libraries and they will no longer be maintained, including labelspark, labelpandas, labelsnow, and labelbox-bigquery libraries. To import data from remote sources such as Databricks and Snowflake, set up Census integrations directly on the Labelbox platform.
The Official Open-Source Labelbox <> Pandas Python Integration
Labelbox enables teams to maximize the value of their unstructured data with its enterprise-grade training data platform. For ML use cases, Labelbox has tools to deploy labelers to annotate data at massive scale, diagnose model performance to prioritize labeling, and plug in existing ML models to speed up labeling. For non-ML use cases, Labelbox has a powerful catalog with auto-computed similarity scores that users can use to add metadata tags to large amounts of data with a couple of clicks.
Pandas is the premier open-source Python library for handling CSV and tabular data and one of the most widely used Python libraries in the world.
This GitHub repo is an open-source Python library, moderated by the Labelbox Solutions team, that helps Labelbox users upload data to Labelbox and retrieve data from Labelbox in tabular / CSV format using Pandas.
We strongly encourage collaboration: please feel free to fork this repo and tweak the code base to work for your own data, and make pull requests if you have suggestions on how to enhance the overall experience, add new features, or improve general performance.
Please report any issues/bugs via GitHub Issues.
Table of Contents
Requirements
Setup
Set up LabelPandas with the following lines of code:
!pip install labelpandas -q
import labelpandas as lp
api_key = "" # Insert your Labelbox API key here
client = lp.Client(api_key)
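Hardcoding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of reading it from an environment variable instead is shown here; the variable name `LABELBOX_API_KEY` and the helper `load_api_key` are illustrative assumptions and not part of labelpandas.

```python
import os


def load_api_key(env_var: str = "LABELBOX_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is unset or empty.

    The variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Labelbox API key"
        )
    return api_key
```

The returned string can then be passed to the client in place of the literal, for example `client = lp.Client(load_api_key())`.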
Once set up, you can run the following core functions:
- client.create_data_rows_from_table(): Creates Labelbox data rows (and metadata) given a Pandas table
- client.export_to_table(): Exports labels (and metadata) from a given Labelbox project and creates a Pandas DataFrame
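As a rough illustration of the upload path, the sketch below builds a small table and hands it to the client. The column names `row_data` and `global_key`, the keyword arguments `table` and `dataset_id`, and both helper functions are assumptions made for this example, not a confirmed labelpandas signature; the example notebooks are the place to check the expected names.

```python
import pandas as pd


def build_upload_table(urls):
    """Build a one-row-per-asset DataFrame from an iterable of asset URLs.

    The column names `row_data` and `global_key` are assumptions for this
    sketch.
    """
    urls = list(urls)
    return pd.DataFrame(
        {
            "row_data": urls,
            # Use the last path segment as the key; this only works if the
            # file names are unique across the table.
            "global_key": [url.rsplit("/", 1)[-1] for url in urls],
        }
    )


def upload_table(client, table, dataset_id):
    """Pass the table to the client's upload function.

    The keyword names `table` and `dataset_id` are assumptions, not a
    confirmed signature.
    """
    return client.create_data_rows_from_table(table=table, dataset_id=dataset_id)


df = build_upload_table(
    ["https://example.com/images/cat.jpg", "https://example.com/images/dog.jpg"]
)
print(df)
```

With a real client and dataset ID, `upload_table(client, df, dataset_id)` would then forward the table for upload.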
Example Notebooks
Importing Data from a CSV
| Notebook | Github | Google Colab |
| ------------------------------ | -------- | ----------------- |
| Basics: Data Rows from URLs | | |
| Data Rows from Raw Text* | | |
| Data Rows from Local Files | | |
| Data Rows with Metadata | | |
| Data Rows with Attachments | | |
| Data Rows with Annotations | | |
| Putting it all Together | | |
Exporting Data to a CSV
| Notebook | Github | Google Colab |
| ------------------------------ | -------- | ----------------- |
| Exporting Data to a CSV | | |
\* = Coming soon
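The export direction can be sketched in the same spirit: call the export function, then write the resulting DataFrame to disk. Passing the project ID as the only positional argument is an assumption for this example, and `export_project_to_csv` is a hypothetical helper, not part of labelpandas.

```python
import pandas as pd


def export_project_to_csv(client, project_id, csv_path):
    """Export a project's labels through `client` and write them to `csv_path`.

    Passing the project ID as the only positional argument is an assumption
    for this sketch, not a confirmed signature.
    """
    table = client.export_to_table(project_id)
    if not isinstance(table, pd.DataFrame):
        raise TypeError("expected export_to_table() to return a pandas DataFrame")
    # index=False keeps the DataFrame's row index out of the CSV file.
    table.to_csv(csv_path, index=False)
    return table
```

Returning the table as well as writing it lets a notebook inspect the export without re-reading the file.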
Provenance
To enhance the software supply chain security of Labelbox's users, as of 0.1.4, every release contains a SLSA Level 3 Provenance document.
This document provides detailed information about the build process, including the repository and branch from which the package was generated.
By using the SLSA framework's official verifier, you can verify the provenance document to ensure that the package is from a trusted source. Verifying the provenance helps confirm that the package has not been tampered with and was built in a secure environment.
Example of usage for the 0.1.4 release wheel:
VERSION=0.1.4 #tag
gh release download ${VERSION} --repo Labelbox/labelpandas
slsa-verifier verify-artifact --source-branch main --builder-id 'https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@refs/tags/v2.0.0' --source-uri "git+https://github.com/Labelbox/labelpandas" --provenance-path multiple.intoto.jsonl ./labelpandas-${VERSION}-py3-none-any.whl
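For scripted checks, the same verification command can be assembled in Python. This is a sketch that mirrors the flags shown above; the helper name `build_verify_command` is hypothetical, and reusing the `v2.0.0` builder ID for releases other than 0.1.4 is an assumption.

```python
import shlex

BUILDER_ID = (
    "https://github.com/slsa-framework/slsa-github-generator/"
    ".github/workflows/generator_generic_slsa3.yml@refs/tags/v2.0.0"
)


def build_verify_command(version, provenance_path="multiple.intoto.jsonl"):
    """Return the slsa-verifier argument list for a given release version."""
    return [
        "slsa-verifier", "verify-artifact",
        "--source-branch", "main",
        "--builder-id", BUILDER_ID,
        "--source-uri", "git+https://github.com/Labelbox/labelpandas",
        "--provenance-path", provenance_path,
        f"./labelpandas-{version}-py3-none-any.whl",
    ]


command = build_verify_command("0.1.4")
# Print the command in a form that can be pasted into a shell.
print(shlex.join(command))
# To execute it, pass the list to subprocess.run(command, check=True) after
# downloading the release assets into the current directory.
```

Building the command as a list avoids shell quoting issues when the version or paths come from variables.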
