CloudMind
CloudMind is an open source, Cloudflare-native, serverless-first private AI memory layer for the AI era.
It is designed as a BYOC (Bring Your Own Cloud) project:
- deploy to your own Cloudflare account
- keep raw assets, derived content, and indexes under your control
- stay private by default instead of depending on a hosted SaaS operator
- run on Cloudflare-native infrastructure by default, with managed availability and low ops overhead
- preserve abstraction boundaries for future migration
Use Cases
CloudMind is not limited to a traditional personal knowledge base. It is intended for scenarios such as:
- building a personal AI memory layer that can be searched, cited, and reused
- turning saved URLs, notes, PDFs, and AI conversations into structured context
- grounding LLM applications with user-owned knowledge instead of SaaS-locked data
- exposing a private memory and context system through Web UI, REST APIs, browser extensions, MCP tools, and future interfaces
- serving as a portable foundation for retrieval, agent memory, and context engineering workflows
Overview
CloudMind ingests URLs, notes, PDFs, browser-captured data, and AI-originated content into a unified memory layer, then runs a processing pipeline to produce:
- normalized content
- summaries
- chunks
- embeddings
- searchable and answerable memory assets
The current implementation is a single HonoX full-stack app with:
- Web UI
- REST API
- remote MCP server
- queue-driven ingest workflows
Tech Stack
| Layer | Choice |
| --- | --- |
| Full-stack framework | HonoX + Hono |
| Language | TypeScript |
| Validation | Zod |
| Database | Cloudflare D1 |
| ORM | Drizzle ORM |
| Blob storage | Cloudflare R2 |
| Vector index | Cloudflare Vectorize |
| Async processing | Cloudflare Queues |
| AI provider | Cloudflare Workers AI |
| Testing | Vitest |
| Lint / format | Biome |
Architecture
CloudMind keeps business logic separated from infrastructure details. The core service layer is written against ports so the default Cloudflare implementation can be replaced later.
Key boundaries:
- `AssetRepository`
- `BlobStore`
- `VectorStore`
- `JobQueue`
- `AIProvider`
- `WorkflowRepository`
Current infrastructure mapping:
| Port | Default implementation |
| --- | --- |
| AssetRepository | D1 + Drizzle |
| WorkflowRepository | D1 + Drizzle |
| BlobStore | R2 |
| VectorStore | Vectorize |
| JobQueue | Cloudflare Queues |
| AIProvider | Workers AI |
This keeps the application aligned with a fast Cloudflare-native MVP while leaving room for migration to PostgreSQL, pgvector, S3-compatible storage, or other model providers later.
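As a rough illustration of the ports-and-adapters boundary described above, here is a minimal sketch. The `BlobStore` interface, the in-memory stub, and `persistRawAsset` are hypothetical names for this example; the real port contracts live in `src/core/` and may differ in shape.

```typescript
// Hypothetical port: service code depends on this interface,
// never on R2 (or any other storage backend) directly.
interface BlobStore {
  put(key: string, data: string): Promise<void>;
  get(key: string): Promise<string | null>;
}

// The default implementation would wrap an R2 bucket binding;
// an in-memory stub keeps services testable without Cloudflare.
class InMemoryBlobStore implements BlobStore {
  private store = new Map<string, string>();
  async put(key: string, data: string): Promise<void> {
    this.store.set(key, data);
  }
  async get(key: string): Promise<string | null> {
    return this.store.get(key) ?? null;
  }
}

// A service function written against the port: swapping R2 for
// S3-compatible storage later means writing a new adapter only.
async function persistRawAsset(blobs: BlobStore, assetId: string, raw: string): Promise<void> {
  await blobs.put(`assets/${assetId}/raw`, raw);
}
```

Because services only see the interface, a future migration changes the adapter wiring, not the business logic.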
Processing Pipeline
The ingest system is workflow-driven. Assets are processed through type-specific workflows:
- `note_ingest_v1`
- `url_ingest_v1`
- `pdf_ingest_v1`
Typical flow:
- create asset metadata
- persist raw input
- create workflow run
- normalize and persist clean content
- generate summary
- split into chunks
- create embeddings
- write vectors and chunk metadata
- finalize asset state
The queue consumer is wired through `app/server.ts`, and workflow dispatch is resolved through `src/features/workflows/server/registry.ts`.
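The step-by-step flow above can be sketched as a sequential workflow runner. The `Step` type, the runner, and the three toy steps here are illustrative assumptions, not CloudMind's actual workflow definitions.

```typescript
// Hypothetical step-based workflow: each step reads from and writes to a
// shared context, and the runner records completion order so a run can be
// inspected or resumed.
type StepContext = Map<string, unknown>;
type Step = { name: string; run: (ctx: StepContext) => Promise<void> };

async function runWorkflow(steps: Step[], ctx: StepContext): Promise<string[]> {
  const completed: string[] = [];
  for (const step of steps) {
    await step.run(ctx); // each step persists its output before the next runs
    completed.push(step.name);
  }
  return completed;
}

// Toy analogue of a note ingest workflow: normalize, summarize, chunk.
const noteIngestSketch: Step[] = [
  {
    name: "normalize",
    run: async (ctx) => {
      ctx.set("clean", String(ctx.get("raw")).trim());
    },
  },
  {
    name: "summarize",
    run: async (ctx) => {
      ctx.set("summary", String(ctx.get("clean")).slice(0, 80));
    },
  },
  {
    name: "chunk",
    run: async (ctx) => {
      ctx.set("chunks", String(ctx.get("clean")).split("\n\n"));
    },
  },
];
```

In the real system each step would persist its artifact (clean content, summary, chunks, vectors) through the ports, so a failed run can retry from recorded state rather than restart from scratch.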
Retrieval Model
Search and chat are built on a hybrid retrieval strategy:
- chunk-level semantic recall from Vectorize
- summary-level fallback matches from D1
- source-aware answer generation for chat responses
This allows:
- retrieval of precise local passages when chunk vectors are available
- graceful fallback to summary-only assets
- grounded responses with source references
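The merge step of that hybrid strategy can be sketched as follows. The `Hit` shape and `hybridMerge` function are assumptions for illustration; the actual scoring and store queries live in the search feature.

```typescript
// Illustrative hybrid merge: prefer chunk-level vector hits (e.g. from a
// Vectorize topK query), then fall back to summary-level matches (e.g.
// from D1) for assets that have no chunk vectors yet.
interface Hit {
  assetId: string;
  text: string;
  source: "chunk" | "summary";
}

function hybridMerge(chunkHits: Hit[], summaryHits: Hit[]): Hit[] {
  // Assets already covered by precise chunk hits don't need a
  // summary fallback; summary-only assets still surface.
  const covered = new Set(chunkHits.map((h) => h.assetId));
  const fallback = summaryHits.filter((h) => !covered.has(h.assetId));
  return [...chunkHits, ...fallback];
}
```

This is what makes degradation graceful: an asset ingested before chunk embeddings exist is still retrievable via its summary.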
Web Surface
| Route | Purpose |
| --- | --- |
| / | home |
| /capture | ingest entry page |
| /assets | asset list |
| /assets/:id | asset detail |
| /search | semantic retrieval UI |
| /ask | memory-grounded Q&A |
API Surface
Ingest
- `POST /api/ingest/text`
- `POST /api/ingest/url`
- `POST /api/ingest/file`
- `POST /api/assets/:id/process`
- `POST /api/assets/backfill/chunks`
Assets
- `GET /api/assets`
- `GET /api/assets/:id`
- `PATCH /api/assets/:id`
- `DELETE /api/assets/:id`
- `GET /api/assets/:id/jobs`
- `GET /api/assets/:id/workflows`
Workflows
GET /api/workflows/:id
Search / Chat / Health
- `POST /api/search`
- `POST /api/chat`
- `GET /api/health`
MCP Server
CloudMind exposes a remote MCP server over stateless HTTP at:
POST /mcp
Available tools:
- `save_asset`
- `search_assets`
- `get_asset`
- `ask_library`
Tool semantics:
- `save_asset`: ingest a text note or URL into the memory layer
- `search_assets`: run semantic retrieval and return matched chunks or summary hits
- `get_asset`: fetch asset detail by ID
- `ask_library`: answer a question with grounded memory evidence
`GET /mcp` and `DELETE /mcp` are intentionally rejected with 405 Method Not Allowed.
The MCP tool capabilities are implemented in `src/features/mcp/server/service.ts` and routed in `src/features/mcp/server/routes.ts`.
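An MCP client calls these tools with JSON-RPC 2.0 `tools/call` requests posted to `/mcp`. The sketch below builds such a payload for `save_asset`; the argument field names (`title`, `content`) are assumptions here — check the routes above for the actual input schema.

```typescript
// Build a JSON-RPC 2.0 tools/call request body, the framing MCP uses
// over its HTTP transport. The tool name comes from the list above;
// the argument shape is an assumption for illustration.
function buildToolCall(id: number, tool: string, args: Record<string, unknown>) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

const body = JSON.stringify(
  buildToolCall(1, "save_asset", {
    title: "Cloudflare Queues notes",
    content: "Queues drive async workflow execution.",
  }),
);
// POST `body` to /mcp with Content-Type: application/json.
```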
Example Requests
Create a text asset:
```sh
curl -X POST http://localhost:5173/api/ingest/text \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Cloudflare Queues notes",
    "content": "Queues drive async workflow execution in CloudMind."
  }'
```
Run semantic search:
```sh
curl -X POST http://localhost:5173/api/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "queue-driven ingestion",
    "page": 1,
    "pageSize": 10
  }'
```
Ask the memory layer:
```sh
curl -X POST http://localhost:5173/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "question": "How does CloudMind process ingested content?",
    "topK": 5
  }'
```
Project Structure
```
app/
  routes/        HonoX page routes
  server.ts      app entry and queue consumer entry
src/
  core/          domain ports and core contracts
  env.ts         Cloudflare binding types
  features/
    assets/      asset query and management
    chat/        memory-grounded Q&A
    ingest/      ingest entrypoints and orchestration
    mcp/         remote MCP server
    search/      semantic retrieval
    workflows/   workflow runtime and definitions
  platform/
    ai/          Workers AI adapter
    blob/        R2 adapter
    db/          D1 repositories and schema
    queue/       Queue adapter
    vector/      Vectorize adapter
drizzle/         D1 migrations
tests/unit/      Vitest unit tests
```
Local Development
Install dependencies and run the app:
```sh
npm install
npm run dev
```
Useful scripts:
```sh
npm run build
npm run worker:dev
npm run worker:deploy
npm run db:migrate:remote
npm run deploy
npm run deploy:bootstrap
npm run deploy:one-click
npm run typecheck
npm run lint
npm run format
npm run test
```
One-Click Deploy (New Users)
Option A: Deploy Button (GitHub -> Cloudflare Dashboard)
Use the button at the top of this README, or open:
- https://deploy.workers.cloudflare.com/?url=https://github.com/evepupil/CloudMind
The Deploy flow will guide you to connect your repo and provision Cloudflare resources before deployment.
Option B: Local One-Command Bootstrap
For first-time setup in your own Cloudflare account:
```sh
npm install
npm run deploy:one-click -- --prefix my-cloudmind
```
This command will:
- create D1 / R2 / Vectorize / Queue resources
- write bindings to `wrangler.jsonc`
- apply D1 migrations from `drizzle/`
- run `npm run deploy`
If you only want to initialize resources first:
```sh
npm run deploy:bootstrap -- --prefix my-cloudmind
```
Existing Project Note
`wrangler.jsonc` in this repository uses existing project bindings.
If you deploy to a fresh account, use either:
- the Deploy Button provisioning flow, or
- the `npm run deploy:one-click` bootstrap script
Cloudflare Bindings
The app expects these bindings, defined in wrangler.jsonc:
- `DB`
- `ASSET_FILES`
- `ASSET_VECTORS`
- `WORKFLOW_QUEUE`
- `AI`
Binding types are declared in src/env.ts.
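For orientation, a fresh-account `wrangler.jsonc` would wire those bindings roughly like this. The resource names and the database ID below are placeholders, not values from this repository; the bootstrap script writes the real ones.

```jsonc
{
  // Placeholder resource names; the one-click bootstrap fills in real values.
  "d1_databases": [
    { "binding": "DB", "database_name": "cloudmind", "database_id": "<your-database-id>" }
  ],
  "r2_buckets": [{ "binding": "ASSET_FILES", "bucket_name": "cloudmind-assets" }],
  "vectorize": [{ "binding": "ASSET_VECTORS", "index_name": "cloudmind-vectors" }],
  "queues": {
    "producers": [{ "binding": "WORKFLOW_QUEUE", "queue": "cloudmind-workflows" }],
    "consumers": [{ "queue": "cloudmind-workflows" }]
  },
  "ai": { "binding": "AI" }
}
```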
Testing
The repository includes unit coverage for:
- ingest services and routes
- asset services and routes
- search services and routes
- chat services and routes
- MCP routes
- workflow services
- Workers AI adapter
Run the baseline verification suite with:
```sh
npm run typecheck
npm run lint
npm run test
```
Design Notes
Important implementation constraints:
- raw assets are retained; AI-derived outputs are recomputable
- infrastructure details should not leak across business logic
- queue-driven workflows are preferred over tightly coupled synchronous pipelines
- AI outputs are advisory, replaceable, and retryable
- retrieval and chat should degrade gracefully when some derived artifacts are missing
For product direction and architectural constraints, see AGENTS.md.
