Dataroom
Framework based on a vector database to store, manage and curate large image datasets
DataRoom
<img src="./screenshot.jpg" alt="Screenshot of DataRoom UI" />
DataRoom is a high-performance AI training data management platform featuring a beautiful UI, multimodal support (images, latents, masks), similarity search, and a Python client for seamless integration.
To try it out, follow the guide below. Also check out the Python client inside dataroom_client and the examples in notebooks.
Getting Started
The simplest and fastest way to get a DataRoom stack up and running is to use Docker. If you prefer to run it without Docker, see the Setup without Docker section.
cp .env.example .env
cp backend/config/settings/local.example.py backend/config/settings/local.py
Build and Start Services
The following command builds and starts the Django, Postgres and OpenSearch containers:
docker compose up -d --build
Collect Static Files
The static files are built as part of the Django Docker image. To collect them, run:
docker compose run --rm dataroom_django python manage.py collectstatic --link --clear --noinput
Run Database Migrations
docker compose run --rm dataroom_django python manage.py migrate
Setup OpenSearch Indices
docker compose run --rm dataroom_django python manage.py setup_opensearch --confirm
Create admin user
docker compose run --rm dataroom_django python manage.py createsuperuser --noinput --email admin@photoroom.dev
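The Docker steps above have to run in a fixed order, so it can be convenient to wrap them in a small script. The sketch below is only an illustration: the commands are copied from this guide, but the `MANAGE`, `STEPS`, and `bootstrap` names are hypothetical and not part of DataRoom. It assumes the two `cp` commands from the start of this section have already been run.

```python
import shlex
import subprocess

# Common prefix for the management commands listed in this guide.
MANAGE = "docker compose run --rm dataroom_django python manage.py"

# Commands copied from the steps above, in the order they are meant to run.
STEPS = [
    "docker compose up -d --build",
    f"{MANAGE} collectstatic --link --clear --noinput",
    f"{MANAGE} migrate",
    f"{MANAGE} setup_opensearch --confirm",
    f"{MANAGE} createsuperuser --noinput --email admin@photoroom.dev",
]


def bootstrap(dry_run: bool = True) -> list[list[str]]:
    """Run each step in order, stopping at the first failure.

    With dry_run=True (the default) the commands are only printed.
    """
    executed = []
    for step in STEPS:
        argv = shlex.split(step)
        print("$", step)
        if not dry_run:
            # check=True raises CalledProcessError if a step fails.
            subprocess.run(argv, check=True)
        executed.append(argv)
    return executed


if __name__ == "__main__":
    bootstrap(dry_run=True)
```

Call `bootstrap(dry_run=False)` to actually execute the commands once the printed list looks right.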
Access the application
Go to http://localhost:8000 and log in with admin@photoroom.dev / admin
Quick overview:
- Full Application: http://localhost:8000 (Django serves pre-built React frontend)
- Django Admin: http://localhost:8000/admin/
- API Docs: http://localhost:8000/api/docs/
- OpenSearch: http://localhost:9200
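Once the containers are up, a quick way to confirm that each endpoint responds is a small standard-library script like the one below. It is a sketch rather than a project tool: the URLs are the ones listed above, while the `ENDPOINTS` mapping and the `check` helper are hypothetical names introduced here.

```python
import http.client
import urllib.error
import urllib.request

# URLs taken from the overview above; the labels are only for display.
ENDPOINTS = {
    "Full Application": "http://localhost:8000",
    "Django Admin": "http://localhost:8000/admin/",
    "API Docs": "http://localhost:8000/api/docs/",
    "OpenSearch": "http://localhost:9200",
}


def check(url: str, timeout: float = 3.0) -> str:
    """Return a short status string for one URL (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 401 or 404).
        return f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError, http.client.HTTPException) as exc:
        # Nothing is listening, the connection timed out, or the reply was not HTTP.
        return f"unreachable ({exc})"


if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        print(f"{name:<18} {url:<34} {check(url)}")
```

Any `HTTP ...` result means the service answered; `unreachable` usually means the container is not running or the port is mapped differently.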
Local Development
For active frontend development with instant Hot Module Replacement (HMR):
Prerequisites
- Node.js 22.14.0: nvm use 22.14.0
- npm 10.9.2
Setup
1. Install Node.js version from .nvmrc:
nvm install && nvm use
2. Install frontend dependencies:
npm install
3. Enable development mode in Django settings.
Update backend/config/settings/local.py:
# FRONTEND
# ------------------------------------------------------------------------------
DJANGO_VITE_DEV_MODE = True
4. Start frontend dev server locally:
npm run dev # Runs on port 3000
# For port conflicts, override the port:
npm run dev -- --port 3001 # or any available port
5. Rebuild and start Django, Postgres and OpenSearch containers:
docker compose up -d --build
See Backend setup if you would like to run the Django backend locally without Docker.
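For step 3 above, an alternative to hard-coding the flag is to read it from the environment, so the same local.py works both with and without the Vite dev server. The sketch below is an assumption, not something the project ships: the `env_flag` helper is hypothetical, and reusing `DJANGO_VITE_DEV_MODE` as the environment variable name is just a convenient choice.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean (hypothetical helper)."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


# FRONTEND
# ------------------------------------------------------------------------------
# True is intended for use with the local Vite dev server (HMR);
# False keeps the pre-built frontend.
DJANGO_VITE_DEV_MODE = env_flag("DJANGO_VITE_DEV_MODE", default=False)
```

Setting the variable to 1, true, yes, or on enables dev mode; any other value, or leaving it unset, keeps the default.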
Run Tests
Run all the backend tests inside the Django container:
docker compose run --rm dataroom_django pytest
Pre-commit Hooks
Please install the pre-commit hooks to help maintain code quality:
pre-commit install --hook-type pre-commit
Other Useful Commands
Run production server:
docker compose run --service-ports dataroom_django ./scripts/run_web.sh
Run production background tasks:
docker compose run --service-ports dataroom_django ./scripts/run_tasks.sh
Reset database:
docker compose exec dataroom_postgres bash -c "su postgres -c 'dropdb dataroom && createdb dataroom'"
View logs:
docker compose logs -f dataroom_django
Restart specific service:
docker compose restart opensearch
Update poetry lock file after adding new dependencies to pyproject.toml:
./scripts/poetry-lock.sh
Static files in production
- The entries in rollupOptions inside vite.config.js define which entry points are going to be built.
- Anything inside /frontend/public/ will simply be copied over. Use this for images included in the HTML.
- Running npm run build builds and bundles the frontend, generating a manifest.json.
- Built files are now ready in /backend/static_built/.
- Running python manage.py collectstatic collects the static files, runs whitenoise, compressing and adding a hash to the filename.
- Final static files are now ready to be served from /backend/static_collected/.
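To make the pipeline above more concrete, the sketch below mimics its two naming steps in plain Python: looking up an entry point in a Vite-style manifest.json, and deriving a content-hashed filename similar in spirit to what collectstatic produces. The manifest contents, the file names, and both helper functions are invented for illustration; the real build output will differ.

```python
import hashlib
import json
from pathlib import PurePosixPath

# A made-up manifest in the shape Vite generates: source entry -> built files.
MANIFEST_JSON = """
{
  "src/main.tsx": {
    "file": "assets/main-4f3c2a1b.js",
    "src": "src/main.tsx",
    "isEntry": true,
    "css": ["assets/main-9d8e7f6a.css"]
  }
}
"""


def resolve_entry(manifest: dict, entry: str) -> list[str]:
    """Return the built JS file followed by any CSS files for an entry point."""
    chunk = manifest[entry]
    return [chunk["file"], *chunk.get("css", [])]


def hashed_name(path: str, content: bytes) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.md5(content, usedforsecurity=False).hexdigest()[:12]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


manifest = json.loads(MANIFEST_JSON)
print(resolve_entry(manifest, "src/main.tsx"))
print(hashed_name("css/app.css", b"body { margin: 0; }"))
```

Because the hash depends only on the file's bytes, a changed file gets a new name, which is what makes the final files safe to cache for a long time.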
Setup without Docker
<details> <summary>If you prefer to run the project on macOS without Docker, follow these steps.</summary>
Prerequisites
Install these prerequisites:
- python@3.13.0
- virtualenv https://virtualenv.pypa.io/en/latest/installation.html
- poetry@2.0.1 https://python-poetry.org/docs/#installation
- nvm https://github.com/nvm-sh/nvm
- Postgres v16 https://postgresapp.com/
brew install snappy
To use homebrew's openssl and snappy, add the following to your .zshrc:
export LDFLAGS="-L/opt/homebrew/opt/openssl@3/lib -L/opt/homebrew/Cellar/snappy/1.1.10/lib"
export CPPFLAGS="-I/opt/homebrew/opt/openssl@3/include -I/opt/homebrew/Cellar/snappy/1.1.10/include"
Database setup
Create the database:
createdb dataroom
Run OpenSearch:
docker compose up opensearch
python manage.py setup_opensearch
Backend setup
Use the correct python version from .python-version:
brew install pyenv
pyenv init
pyenv install
pyenv local
To create a virtualenv, inside the root project folder, run:
virtualenv .venv
To install all python requirements:
pip install poetry==1.7.1
poetry install
Copy and enable local settings:
cp backend/config/settings/local.example.py backend/config/settings/local.py
Remember to update the DATABASES settings in backend/config/settings/local.py to match your local database.
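As a rough guide, a DATABASES block for a local Postgres install might look like the sketch below. Only the database name dataroom comes from the createdb step above; the engine, user, password, host, and port are assumptions (read here from the standard libpq environment variables) and should be checked against local.example.py and your own setup.

```python
import os

# Sketch for backend/config/settings/local.py.
DATABASES = {
    "default": {
        # Assumption: Django's stock Postgres backend.
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "dataroom",  # created earlier with `createdb dataroom`
        # Everything below is an assumption; adjust to your local Postgres.
        # An empty USER lets the driver fall back to the current OS user.
        "USER": os.environ.get("PGUSER", ""),
        "PASSWORD": os.environ.get("PGPASSWORD", ""),
        "HOST": os.environ.get("PGHOST", "localhost"),
        "PORT": os.environ.get("PGPORT", "5432"),
    }
}
```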
After setting up frontend, build the static files once:
npm run build
Collect the static files:
python manage.py collectstatic --link --clear --noinput
</details>
