NemakiWare
Lightweight, highly customizable, permission-aware document repository for RAG
Permission-aware document repository for RAG — an open source platform that stores documents with fine-grained access control and makes them searchable via semantic vector search, ready to plug into any LLM pipeline.
Why NemakiWare?
Building RAG on top of file storage or generic databases means bolting on permissions after the fact. NemakiWare solves this at the repository layer: every document, every chunk, every search result is governed by the same ACL model. Your LLM only sees what the requesting user is allowed to see.
- ACL-filtered semantic search — vector search results are filtered by the current user's permissions in real time
- Automatic chunking & embedding — upload a document and it is chunked, embedded, and indexed with zero extra work
- MCP server built in — connect Claude, ChatGPT, or any MCP-compatible agent directly to your repository
- Bring your own embeddings — Hugging Face TEI (self-hosted) or Amazon Bedrock (managed)
- Full document lifecycle — versioning, relationships, retention, archival to S3 cold storage
- Modern React UI — browse, search, manage users/groups, configure everything from the browser
Quick Start
Prerequisites
- Docker and Docker Compose
- 4GB+ available memory (16GB+ if enabling the self-hosted embedding server)
1. Build
# Install OpenCMIS JARs to local Maven repository (first build only)
./scripts/install-opencmis-local.sh
# Build UI
cd core/src/main/webapp/ui && npm install && npm run build && cd ../../../..
# Build server
mvn clean package -f core/pom.xml -Pdevelopment -DskipTests -q
# Copy WAR to Docker directory
cp core/target/core.war docker/core/core.war
2. Start
cd docker
# Core services (CouchDB + Solr + NemakiWare)
docker compose -f docker-compose-simple.yml up -d --build
# With self-hosted embedding server (TEI)
docker compose -f docker-compose-simple.yml --profile rag up -d --build
| Service | Port | Description |
|---------|------|-------------|
| NemakiWare | 8080 | Repository server + React UI |
| CouchDB | 5984 | Document database |
| Solr | 8983 | Full-text & vector search |
| TEI | 8081 | Embedding server (rag profile) |
3. Open
- UI: http://localhost:8080/core/ui/
- Credentials: admin/admin
A Setup Wizard runs on first launch to configure the database, authentication, and the embedding provider.
Features
Semantic Search (RAG)
Upload documents and search by meaning, not just keywords.
- Hybrid search: combines keyword full-text search with vector similarity
- Supported formats: PDF, Word, Excel, PowerPoint, HTML, XML, plain text
- Configurable weighting: property boost (metadata) vs content boost (document body)
- Folder-scoped search: restrict results to a specific folder tree
- Similar documents: find documents related to a given document
- Rate limiting: per-user token bucket (configurable)
- Admin tools: full reindex, folder reindex, index health monitoring, search-as-user testing
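To picture how hybrid search and the property/content boosts interact, the following minimal sketch blends a keyword score with a boost-weighted vector similarity. Everything in it is an assumption made for illustration: the `Hit` fields, the default weights, and the linear blend are hypothetical and are not taken from NemakiWare's ranking code.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    keyword_score: float        # full-text relevance, assumed normalized to [0, 1]
    property_similarity: float  # vector similarity against metadata, in [0, 1]
    content_similarity: float   # vector similarity against body chunks, in [0, 1]


def hybrid_score(hit, property_boost=0.3, content_boost=0.7, keyword_weight=0.5):
    """Blend keyword relevance with a boost-weighted vector similarity."""
    total = property_boost + content_boost
    if total <= 0:
        raise ValueError("at least one boost must be positive")
    vector_score = (
        property_boost * hit.property_similarity
        + content_boost * hit.content_similarity
    ) / total
    return keyword_weight * hit.keyword_score + (1 - keyword_weight) * vector_score


def rank(hits, **weights):
    """Return (doc_id, score) pairs, best first."""
    scored = [(h.doc_id, hybrid_score(h, **weights)) for h in hits]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


hits = [
    Hit("title-match", keyword_score=0.2, property_similarity=0.9, content_similarity=0.1),
    Hit("body-match", keyword_score=0.2, property_similarity=0.1, content_similarity=0.9),
]
print(rank(hits))                                         # content-heavy weighting
print(rank(hits, property_boost=0.9, content_boost=0.1))  # metadata-heavy weighting
```

Raising the property boost favors documents whose metadata matches the query, while raising the content boost favors matches in the document body.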
Permission Model
Every search result is checked against the requesting user's permissions before being returned.
- CMIS ACL (Access Control List) on every object
- Inherited permissions from parent folders
- User/group-based access control
- Admin simulation mode for verifying what a specific user can see
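The combination of per-object ACLs, folder inheritance, and user/group principals can be sketched as a post-filter over search hits. This is a simplified illustration under stated assumptions: the `Node` class, the `readers` set, and the `inherit` flag are hypothetical stand-ins for CMIS ACEs, and real CMIS ACLs also distinguish permission levels such as read, write, and all.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    parent: "Node | None" = None
    # Principals (user or group IDs) granted read access directly on this node.
    readers: set[str] = field(default_factory=set)
    inherit: bool = True  # whether entries from the parent folder also apply


def effective_readers(node):
    """Collect read principals from the node and, while inheritance is on, its ancestors."""
    principals = set()
    current = node
    while current is not None:
        principals |= current.readers
        if not current.inherit:
            break
        current = current.parent
    return principals


def filter_hits(hits, user, groups):
    """Keep only the hits that the user, or one of the user's groups, may read."""
    identities = {user, *groups}
    return [h for h in hits if identities & effective_readers(h)]


root = Node("root", readers={"admin"})
hr = Node("hr", parent=root, readers={"group:hr"})
salary = Node("salaries.xlsx", parent=hr)
memo = Node("memo.txt", parent=root, readers={"alice"}, inherit=False)

visible = filter_hits([salary, memo], user="alice", groups=["group:staff"])
print([n.node_id for n in visible])
```

Here `salaries.xlsx` inherits its readers from the `hr` folder and the root, while `memo.txt` has inheritance switched off and is readable only by the principal listed on it.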
MCP Server
NemakiWare exposes an MCP (Model Context Protocol) server so AI agents can directly search and retrieve documents.
| Tool | Description |
|------|-------------|
| nemakiware_login | Authenticate (username/password, API key, or OIDC) |
| nemakiware_search | Full-text keyword search |
| nemakiware_rag_search | Semantic vector search |
| nemakiware_similar_documents | Find similar documents |
| nemakiware_get_document_content | Retrieve document content |
Protocol: JSON-RPC 2.0 via HTTP/SSE.
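As a rough illustration of what an agent sends, the sketch below builds a JSON-RPC 2.0 request for MCP's `tools/call` method. The envelope follows the MCP specification; the argument names (`query`, `topK`) are borrowed from the REST example in this README and are assumptions, not the tool's documented input schema.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC request IDs, unique per client session


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool via `tools/call`."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names are assumed; check the tool's schema (via `tools/list`) before relying on them.
request = tool_call(
    "nemakiware_rag_search",
    {"query": "quarterly revenue report", "topK": 5},
)
body = json.dumps(request)
print(body)
```

In practice an MCP client library handles this envelope for you; the sketch only shows the shape of the message on the wire.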
Embedding Providers
| Provider | Type | Notes |
|----------|------|-------|
| Hugging Face TEI | Self-hosted | Default. Ships as a Docker service. Uses intfloat/multilingual-e5-large (1024 dim) |
| Amazon Bedrock | Managed (Beta) | Titan Embedding V2. IAM role or explicit credentials. See Bedrock guide |
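For debugging the self-hosted provider, it can help to call TEI directly. The sketch below builds a request for TEI's `/embed` endpoint on the port listed in the Quick Start table and checks the 1024-dimension output. The `query: ` / `passage: ` prefixes follow the E5 model card's convention; whether NemakiWare adds them itself is not stated here, so treat that part, and the helper names, as assumptions.

```python
import json
import urllib.request

TEI_URL = "http://localhost:8081/embed"  # port from the Quick Start table
EXPECTED_DIM = 1024                      # multilingual-e5-large output size


def build_embed_request(texts, kind="passage"):
    """Build (but do not send) a TEI /embed request with E5-style prefixes."""
    if kind not in ("query", "passage"):
        raise ValueError("kind must be 'query' or 'passage'")
    payload = {"inputs": [f"{kind}: {t}" for t in texts]}
    return urllib.request.Request(
        TEI_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_vectors(vectors):
    """Raise if any returned vector does not have the expected dimensionality."""
    for v in vectors:
        if len(v) != EXPECTED_DIM:
            raise ValueError(f"expected {EXPECTED_DIM} dimensions, got {len(v)}")
    return vectors


req = build_embed_request(["quarterly revenue report"], kind="query")
# With the rag profile running:
# with urllib.request.urlopen(req) as resp:
#     vectors = check_vectors(json.load(resp))
```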
Authentication
- Password (BCrypt)
- WebAuthn / Passkey (FIDO2 — Touch ID, Face ID, security keys)
- OIDC (Google, Microsoft)
- SAML (Keycloak)
Webhooks
Subscribe to document events (created, updated, deleted, ACL changed) and receive HTTP callbacks. Callbacks can authenticate with Basic, Bearer, or API key credentials and can be signed with HMAC.
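On the receiving side, an HMAC-signed callback is usually verified by recomputing the digest over the raw request body. The sketch below assumes HMAC-SHA256 with a hex-encoded signature; the algorithm, the encoding, the secret, and the payload shape are all assumptions for illustration, so check the webhook configuration for the actual values and header name.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Compare the received signature with the expected one in constant time."""
    return hmac.compare_digest(sign(secret, body), received_signature)


secret = b"shared-webhook-secret"                  # hypothetical shared secret
body = b'{"event":"created","objectId":"abc123"}'  # hypothetical payload shape
signature = sign(secret, body)

print(verify(secret, body, signature))         # untouched body
print(verify(secret, body + b" ", signature))  # tampered body
```

Verify against the bytes exactly as received: re-serializing the JSON before hashing can change whitespace or key order and break the comparison.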
Import / Export
- ACP (Alfresco Content Package) import
- NemakiWare ZIP format with JSON metadata — preserves folder hierarchy, relationships, and IDs
- Filesystem import/export (admin)
Cloud Integration
| Feature | Google | Microsoft |
|---------|--------|-----------|
| OIDC login | Google Account | Microsoft Account |
| Cloud Drive import | Google Drive | OneDrive |
| Directory sync | Google Workspace | Entra ID |
Archive & Retention (Beta)
- Scheduled archival of expired or stale documents
- Cold storage to Amazon S3 (with Legal Hold support)
- COPY mode (keep local + S3) or MOVE mode (S3 only)
- Restore from archive, download archived content
Architecture
                        ┌───────────────┐
                        │   React UI    │
                        └──────┬────────┘
                               │
┌──────────┐  MCP/REST  ┌──────┴────────┐  Embedding   ┌───────────────┐
│ AI Agent ├───────────►│  NemakiWare   ├─────────────►│ TEI / Bedrock │
└──────────┘            │  (Tomcat 11)  │              └───────────────┘
                        └──┬────────┬───┘
                           │        │
                     ┌─────┘        └─────┐
                     ▼                    ▼
                ┌──────────┐         ┌──────────┐
                │ CouchDB  │         │   Solr   │
                │  (data)  │         │ (search) │
                └──────────┘         └──────────┘
Technical Stack
| Component | Technology |
|-----------|------------|
| Server | Tomcat 11 (Jakarta EE 11, Virtual Threads) |
| Framework | Spring 7, Apache Chemistry OpenCMIS |
| Database | CouchDB 3.x |
| Search | Apache Solr 9.x (full-text + DenseVector) |
| UI | React 19, TypeScript, Vite 7, Ant Design 5 |
| Java | 21 (required) |
Project Structure
NemakiWare/
├── core/ # Server (Spring + OpenCMIS)
│ └── src/main/webapp/ui/ # React SPA (TypeScript + Vite)
├── docker/ # Docker Compose configurations
├── solr/ # Solr configuration + vector schema
└── common/ # Shared utilities
REST API
RAG Search
# Semantic search
curl -u admin:admin -X POST \
-H "Content-Type: application/json" \
-d '{"query":"quarterly revenue report","topK":5,"minScore":0.6}' \
http://localhost:8080/core/api/v1/cmis/repositories/bedroom/rag/search
# Find similar documents
curl -u admin:admin \
http://localhost:8080/core/api/v1/cmis/repositories/bedroom/rag/similar/{documentId}
# Health check
curl -u admin:admin \
http://localhost:8080/core/api/v1/cmis/repositories/bedroom/rag/health
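The same semantic search call can be made from a script. The sketch below mirrors the curl example using only the Python standard library; the function name and defaults are illustrative, and the request is built without being sent so the snippet runs without a server.

```python
import base64
import json
import urllib.request

BASE = "http://localhost:8080/core/api/v1/cmis/repositories"


def rag_search_request(repo, query, top_k=5, min_score=0.6,
                       user="admin", password="admin"):
    """Build (but do not send) the POST request for the rag/search endpoint."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    payload = {"query": query, "topK": top_k, "minScore": min_score}
    return urllib.request.Request(
        f"{BASE}/{repo}/rag/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",  # same Basic auth as `curl -u`
        },
        method="POST",
    )


req = rag_search_request("bedroom", "quarterly revenue report")
# Against a running server:
# with urllib.request.urlopen(req) as resp:
#     results = json.load(resp)
```

Because results are filtered by the authenticated user's permissions, running the same query with different credentials is a quick way to confirm that ACLs behave as expected.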
CMIS Browser Binding
# List children of root folder
curl -u admin:admin \
"http://localhost:8080/core/browser/bedroom/root?cmisselector=children"
# Create a document
curl -u admin:admin -X POST \
-F "cmisaction=createDocument" \
-F "propertyId[0]=cmis:objectTypeId" -F "propertyValue[0]=cmis:document" \
-F "propertyId[1]=cmis:name" -F "propertyValue[1]=report.pdf" \
-F "file=@report.pdf" \
"http://localhost:8080/core/browser/bedroom/root"
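A children listing comes back as JSON with an `objects` array, so extracting names takes only a few lines. The sketch below builds the URL with the standard CMIS paging parameters (`maxItems`, `skipCount`) and parses a hand-written sample in the Browser Binding's non-succinct shape. The sample is not real server output, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

BROWSER_ROOT = "http://localhost:8080/core/browser/bedroom/root"


def children_url(max_items=None, skip_count=None):
    """Build the children URL, optionally with CMIS paging parameters."""
    params = {"cmisselector": "children"}
    if max_items is not None:
        params["maxItems"] = max_items
    if skip_count is not None:
        params["skipCount"] = skip_count
    return f"{BROWSER_ROOT}?{urlencode(params)}"


def child_names(response_text):
    """Extract cmis:name from each entry of a (non-succinct) children response."""
    data = json.loads(response_text)
    return [
        entry["object"]["properties"]["cmis:name"]["value"]
        for entry in data.get("objects", [])
    ]


# Trimmed, hand-written sample (not real server output).
sample = json.dumps({
    "objects": [
        {"object": {"properties": {"cmis:name": {"id": "cmis:name", "value": "report.pdf"}}}},
        {"object": {"properties": {"cmis:name": {"id": "cmis:name", "value": "Sites"}}}},
    ],
    "hasMoreItems": False,
    "numItems": 2,
})
names = child_names(sample)
print(children_url(max_items=10), names)
```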
Development
Prerequisites
- Java 21, Maven 3.6+, Node.js 20.19+ (Vite 7 no longer supports Node.js 18)
- Docker (for CouchDB)
Development Server (without Docker)
# Start CouchDB
docker run -d --name couchdb-dev -p 5984:5984 \
-e COUCHDB_USER=admin -e COUCHDB_PASSWORD=password couchdb:3
# Start backend (Jetty, search disabled)
cd core && ./start-jetty-dev.sh
# Start frontend dev server (hot reload); run from the repository root, e.g. in a second terminal
cd core/src/main/webapp/ui && npm run dev
Rebuilding After Changes
# Rebuild UI + WAR + deploy (do not use docker compose restart: it restarts the existing container without rebuilding the image)
cd core/src/main/webapp/ui && npm run build && cd ../../../..
mvn clean package -f core/pom.xml -Pdevelopment -DskipTests -q
cp core/target/core.war docker/core/core.war
cd docker && docker compose -f docker-compose-simple.yml up -d --build --force-recreate core
Testing
# CMIS TCK tests (requires running Docker environment)
mvn test -Dtest=BasicsTestGroup,TypesTestGroup,ControlTestGroup,VersioningTestGroup \
-f core/pom.xml -Pdevelopment
# Playwright E2E tests
cd core/src/main/webapp/ui && npx playwright test --project=chromium
# QA integration tests
./qa-test.sh qa
OpenCMIS JAR Resolution
NemakiWare uses custom OpenCMIS 1.1.0-nemakiware JARs (Jakarta EE compatible). Pre-built JARs are in lib/built-jars/ and must be installed before the first build:
./scripts/install-opencmis-local.sh
Documentation
| Document | Description |
|----------|-------------|
| Architecture | System architecture overview |
| AWS Deployment | Production deployment on AWS |
| Bedrock Embedding | Amazon Bedrock setup |
| Cloud Integration | Google / Microsoft setup |
| Archive Enhancement | Retention & cold storage |
Etymology
"Nemaki" derives from the Japanese word "寝巻き" (pajamas). Relax and enjoy happy enterprise time as if you are lying on the couch in your room!
License
Copyright (c) 2013-2026 aegif.
NemakiWare is Open Source software licensed under the GNU Affero General Public License version 3. See legal/LICENSE for details.
