DatAasee (0.5)
DatAasee centralizes and interlinks distributed library/research metadata into an API‑first union catalog.

A Metadata-Lake for Libraries
Repository: github.com/ulbmuenster/dataasee (nb sources backup)
Maintainer: Christian Himpe (at University and State Library of Münster)
Licenses: MIT (add. CC-BY for openapi.yaml)
Function: Metadata-Lake, Metadata Catalog, Metadata Aggregator, Union Catalog
Audience: University Libraries, Research Libraries, Academic Libraries, Scientific Libraries
Documentation
- Dependencies Overview
- Software Documentation
- Architecture Documentation
- Database Schema
- OpenAPI Schema (Swagger UI)
DatAasee: A Metadata-Lake as Metadata Catalog for a Virtual Data-Lake (Companion Paper, Open Access)
Getting Started (Deployment)
Quick Start (prepare a dedicated directory and run inside it):
$ wget https://raw.githubusercontent.com/ulbmuenster/dataasee/0.5/compose.yaml
$ mkdir -p -m 766 backup
$ DL_PASS=password1 DB_PASS=password2 docker compose up
Web: http://localhost:8000 (API: http://localhost:8343/api/v1/)
- Depends on docker compose (compatible with docker and podman)
- To deploy, there is no need to clone the repository; just use the compose.yaml file.
- See the Deploy Documentation for details.
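Once the stack has been started, it can be useful to wait for the API before sending requests. The following is a minimal sketch, not part of the repository: `wait_ready` is a hypothetical helper name, and it assumes that `wget` is installed and that `GET api/v1/ready` answers with a successful HTTP status only once the service is ready.

```shell
#!/bin/sh
# Sketch: poll the readiness endpoint until the stack answers.
# API_BASE matches the Quick Start default; wait_ready is a hypothetical helper.
API_BASE="${API_BASE:-http://localhost:8343/api/v1}"

# wait_ready [attempts] [delay_seconds]
# Returns 0 as soon as GET $API_BASE/ready succeeds, 1 if all attempts fail.
# Assumption: the endpoint fails (or the connection is refused) while not ready.
wait_ready() {
  attempts="${1:-30}"
  delay="${2:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -q: quiet, -O /dev/null: discard the body; only the exit status matters.
    if wget -q -O /dev/null "$API_BASE/ready"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

With the stack running in another terminal, `wait_ready 30 2 && echo "DatAasee is ready"` would retry for roughly one minute before giving up.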
Tech Stack Canvas
- Setting: Many distributed data and metadata sources
- Goals:
  - Centralize metadata
  - Interlinked metadata catalog
  - Super-index for bibliographic and research data
- Features:
  - Interact through HTTP-API (JSON)
  - Search by filter, full-text, source, doi
  - Custom query via: SQL, Gremlin, Cypher, MQL, GraphQL
- Frontend: Lowdefy (Optional)
- Backend: Connect (fmr. Benthos)
- Data Storage: ArcadeDB (Graph Database)
- Infrastructure: Compose (via Docker or Podman)
- Deployment: via Harbor (at Uni Münster)
- Monitoring: Container Logs (local logging driver)
- Integrations:
  - Protocols: OAI-PMH (HTTP), S3 (HTTP), GET (HTTP), DatAasee (HTTP)
  - Encodings: XML (Plain-Text)
  - Formats: DataCite (XML), DC (XML), LIDO (XML), MARC (XML), MODS (XML)
- Exports: DataCite (JSON), BibJSON (JSON)
- Security: Privileged endpoints (CQRS)
- Testing: check-jsonschema
- Development: GitHub
Default Ports
- 8343: DatAasee API
- 8000: Web Frontend
- 2480: Database API (Development Container Images Only)
- 9999: Database JMX (Development Container Images Only)
API Cheat Sheet
- GET api/v1/api: Returns API specification and schemas.
- GET api/v1/ready: Returns service readiness.
- GET api/v1/metadata: Returns queried metadata records.
- GET api/v1/sources: Returns ingested metadata sources.
- GET api/v1/schema: Returns database schema.
- GET api/v1/enums: Returns enumerated attributes.
- GET api/v1/stats: Returns metadata record statistics.
- POST api/v1/backup: Triggers database backup.
- POST api/v1/ingest: Triggers async ingest of metadata.
- POST api/v1/insert: Inserts single metadata record.
- POST api/v1/health: Probes and returns service liveness.
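The read-only endpoints above can be queried with any HTTP client. The sketch below uses `wget` and two hypothetical helper functions, `api_url` and `api_get`, which are not part of DatAasee; the base URL is the Quick Start default. Query parameters and the credentials for the privileged POST endpoints are deliberately left out, since they are defined in the OpenAPI schema rather than here.

```shell
#!/bin/sh
# Sketch: call the read-only (GET) endpoints from the cheat sheet.
# API_BASE is the Quick Start default; both helpers are hypothetical names.
API_BASE="${API_BASE:-http://localhost:8343/api/v1}"

# api_url ENDPOINT
# Prints the full URL for an endpoint name such as "ready" or "stats".
api_url() {
  # ${API_BASE%/} drops one trailing slash so the result has no "//".
  printf '%s/%s\n' "${API_BASE%/}" "$1"
}

# api_get ENDPOINT
# Fetches the endpoint and writes the response body (JSON) to stdout.
api_get() {
  wget -q -O - "$(api_url "$1")"
}
```

For example, `api_get stats` would print the record statistics and `api_get sources` the ingested sources, provided the API is reachable at `API_BASE`.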
Repository Contents
- api/: API definition and message schemas
- assets/: Logos and style definition
- backend/: Processor pipeline and component definitions
- container/: Dockerfiles
- database/: Database initialization, schemas and enumerated data
- docs/: Documentation of software, data and architecture
- frontend/: Prototype frontend definition
- tests/: Test definitions and data
Getting Started (Development)
- Available make targets:
  - make setup: Build server images (builds development images)
  - make start: Start servers
  - make stop: Stop servers
  - make reset: Stop and start servers
  - make build: Build release images (pass REGISTRY= to set container image registry)
  - make empty: Delete database backups
  - make logs: Show logs (requires grep)
  - make peak: Report peak database memory usage (requires grep)
  - make test: Run tests (requires check-jsonschema, busybox, wget)
  - make tidy: List violations of StrictYAML (requires yamllint)
  - make todo: List inline TODOs in repo (requires grep)
- Custom make variable: COMPOSE (set Compose implementation)
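The targets above can be chained into a simple development cycle. The following is a sketch under assumptions: `dev_cycle` is a hypothetical wrapper, it is meant to be run from the repository root, and the form of the `COMPOSE` value (for example `podman compose`) is a guess that should be checked against the Makefile.

```shell
#!/bin/sh
# Sketch: build development images, start the servers, run the tests, stop again.
# dev_cycle is a hypothetical wrapper around the documented make targets.

# dev_cycle [COMPOSE_IMPL]
# If COMPOSE_IMPL is given, it is passed to every target as COMPOSE=...
dev_cycle() {
  if [ -n "${1:-}" ]; then
    set -- "COMPOSE=$1"
  else
    set --
  fi
  make setup "$@" &&
  make start "$@" &&
  make test "$@"
  status=$?
  # Always stop the servers, then report the first failure (if any).
  make stop "$@"
  return "$status"
}
```

Calling `dev_cycle` uses the default Compose implementation, while `dev_cycle "podman compose"` overrides it. If the servers need time to come up, a pause between `make start` and `make test` may be required.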
tl;dr
DatAasee is a centralized metasearch for distributed metadata.
