Cognisync

Cognisync is a filesystem-first framework for building LLM-maintained knowledge bases.

It turns the workflow described by Andrej Karpathy into a reusable, open-source system:

  1. Collect raw source material into a workspace.
  2. Index and normalize that material into a deterministic manifest.
  3. Generate structured work packets for LLM agents to compile a wiki.
  4. Lint the resulting knowledge base for integrity problems.
  5. Answer questions by searching the corpus and rendering outputs back into Markdown, slides, and other artifacts.

The goal is not to replace your favorite model or agent runner. The goal is to provide the workspace model, orchestration contracts, indexing primitives, and output formats that let people build serious tooling around this pattern.

Core Ideas

  • Filesystem-native: raw/, wiki/, and outputs/ stay readable in tools like Obsidian.
  • LLM-compatible: the framework produces prompt packets and execution plans for external LLM CLIs.
  • Incremental: every scan, lint pass, query, and report can be filed back into the workspace.
  • Deterministic where possible: indexing, search, linting, and report scaffolding work without network access.
  • Extensible: users can write adapters, renderers, and orchestration layers on top of the core contracts.

Workspace Layout

workspace/
├── AGENTS.md
├── log.md
├── raw/
│   └── ... source documents, repos, datasets, images
├── wiki/
│   ├── index.md
│   ├── sources.md
│   ├── concepts.md
│   ├── queries.md
│   ├── sources/
│   ├── concepts/
│   └── queries/
├── outputs/
│   ├── reports/
│   │   ├── change-summaries/
│   │   ├── exports/
│   │   ├── research-jobs/
│   │   ├── review-exports/
│   │   └── review-ui/
│   └── slides/
├── prompts/
└── .cognisync/
    ├── access.json
    ├── audit.json
    ├── collaboration.json
    ├── config.json
    ├── control-plane.json
    ├── graph.json
    ├── index.json
    ├── notifications.json
    ├── review-actions.json
    ├── review-queue.json
    ├── runs/
    ├── shared-workspace.json
    ├── sync/
    ├── sources.json
    ├── usage.json
    └── plans/
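The skeleton above is plain directories and files, so scaffolding it needs nothing beyond pathlib. A minimal sketch, assuming the directory names from the tree (the `scaffold` function and seed contents are hypothetical, not Cognisync's actual `init` implementation):

```python
from pathlib import Path

# Directories taken from the workspace layout above.
WORKSPACE_DIRS = [
    "raw",
    "wiki/sources", "wiki/concepts", "wiki/queries",
    "outputs/reports/change-summaries", "outputs/slides",
    "prompts",
    ".cognisync/runs", ".cognisync/plans", ".cognisync/sync",
]

def scaffold(root: str) -> Path:
    """Create the workspace skeleton; idempotent and file-native."""
    base = Path(root)
    for rel in WORKSPACE_DIRS:
        (base / rel).mkdir(parents=True, exist_ok=True)
    # Seed the two operator-facing root files if they are missing.
    for name in ("AGENTS.md", "log.md"):
        marker = base / name
        if not marker.exists():
            marker.write_text(f"# {name.removesuffix('.md')}\n")
    return base
```

Because every `mkdir` uses `exist_ok=True` and the root files are only seeded when absent, rerunning the scaffold against an existing workspace is safe.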

What Ships In This Reference Implementation

  • Workspace scaffolding
  • A root AGENTS.md workspace schema that explains the file-native contract to agents
  • A root log.md activity ledger that records init, ingest, lint, compile, research, and maintenance work
  • Deterministic corpus scanner and manifest builder
  • Stable source and graph manifests under .cognisync/
  • Stable review queue manifests for graph follow-up work under .cognisync/
  • Durable review-action state so accepted concepts, merge decisions, and dismissals survive rescans
  • Durable collaboration threads under .cognisync/collaboration.json so artifact review requests, comments, approvals, and change requests travel with the workspace
  • Durable shared-workspace state under .cognisync/shared-workspace.json so peer bindings, accepted remote principals, and handoff bundles stay file-native too
  • Durable control-plane state under .cognisync/control-plane.json so invites, bearer tokens, and scheduler ticks stay file-native too
  • Regenerated wiki navigation catalogs at wiki/index.md, wiki/sources.md, wiki/concepts.md, and wiki/queries.md
  • Deterministic corpus change summaries after scan, ingest, maintenance, and research runs
  • Export bridges for JSONL research datasets, training bundles, and presentation bundles
  • Evaluation reports over persisted research runs
  • Research job notes and validation reports under outputs/reports/research-jobs/
  • Markdown-aware search over raw/ and wiki/
  • Compile planner for missing summaries, concept pages, and repair work
  • Knowledge-base linter for broken links, missing summaries, graph conflicts, and duplicate concepts
  • Markdown and Marp report renderers
  • Research and compile run manifests with persisted validation state
  • Command adapter contracts for wiring in external LLM CLIs
  • A tested Python API and CLI
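The "Markdown-aware search over raw/ and wiki/" in the list above can be approximated deterministically with the standard library. A naive term-frequency sketch (the scoring scheme is an assumption for illustration, not Cognisync's actual ranking):

```python
import re
from pathlib import Path

def search(root: str, query: str, limit: int = 5) -> list[tuple[str, int]]:
    """Rank Markdown files under root by naive term frequency of the query words."""
    terms = [t.lower() for t in re.findall(r"\w+", query)]
    scored = []
    for path in Path(root).rglob("*.md"):
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        score = sum(text.count(term) for term in terms)
        if score:
            scored.append((str(path), score))
    # Highest-scoring pages first.
    return sorted(scored, key=lambda pair: -pair[1])[:limit]
```

Like the framework's own indexing, this works entirely offline: no network access, no model calls.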

Quickstart

python3 -m pip install -e .
cognisync init .
cognisync doctor --strict
cognisync ingest batch sources.json
cognisync adapter list
cognisync adapter install codex --profile codex
cognisync compile --profile codex --strict
cognisync research "what are the main themes in this workspace?" --profile codex --mode memo --slides

Try The Demo

If you want a concrete workspace immediately, Cognisync can scaffold a polished demo garden:

cognisync demo

By default this writes a browsable example into examples/research-garden/. The demo includes:

  • seeded raw source material
  • compiled source summaries and concept pages
  • a filed query page
  • generated reports, slides, and prompt packets

You can inspect the checked-in example in examples/research-garden or follow the walkthrough in Demo Walkthrough.

Operator Workflow

Cognisync is strongest when you use it as a loop, not a bag of separate commands:

cognisync doctor --strict
cognisync ingest batch sources.json
cognisync review
cognisync collab request-review outputs/reports/report.md --assign reviewer-1 --actor-id editor-1
cognisync maintain
cognisync compile --profile codex --strict
cognisync research "what changed in this corpus?" --profile codex --slides

The operator-facing workflow is documented in Operator Workflows.

Each scan, ingest, maintenance, and research pass now also writes a small change artifact into outputs/reports/change-summaries/ so the workspace records what moved:

  • artifact and source count deltas
  • orphan-page delta
  • graph node and edge deltas
  • new concept pages
  • newly resolved merges
  • newly dismissed review items
  • newly surfaced conflicts
  • suggested follow-up questions based on new conflicts, assertion growth, and coverage gaps
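The count deltas in such a change summary reduce to dictionary arithmetic over two scan snapshots. A sketch, assuming each scan is summarized as a dict of counters (the key names are illustrative):

```python
def change_summary(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Delta of workspace counters between two scans; positive means growth."""
    keys = set(before) | set(after)
    return {k: after.get(k, 0) - before.get(k, 0) for k in sorted(keys)}
```

For example, `change_summary({"sources": 10, "orphans": 2}, {"sources": 12, "orphans": 1})` reports two new sources and one fewer orphan page.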

The workspace root now carries two operator-facing files inspired by the idea-file workflow:

  • AGENTS.md is the durable workspace schema that tells an LLM how to treat raw/, wiki/, outputs/, and .cognisync/
  • log.md is an append-only human-readable timeline of important workspace actions

The wiki root also regenerates four navigation surfaces on refresh:

  • wiki/index.md is the top-level agent entry point
  • wiki/sources.md, wiki/concepts.md, and wiki/queries.md catalog the durable pages in each section
  • source and concept catalogs count as stable navigation backlinks
  • query catalogs only become backlink-bearing when an explicit review action promotes them, so query pages can still surface as orphan review candidates until they are intentionally filed
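Regenerating a catalog page like wiki/sources.md is essentially a deterministic directory listing rendered as links. A minimal sketch (the link format and function name are assumptions, not the framework's actual renderer):

```python
from pathlib import Path

def regenerate_catalog(wiki_root: str, section: str) -> str:
    """Rebuild one section catalog (e.g. wiki/sources.md) from its pages."""
    pages = sorted((Path(wiki_root) / section).glob("*.md"))
    lines = [f"# {section.title()}", ""]
    # Each durable page becomes a stable navigation backlink.
    lines += [f"- [{p.stem}]({section}/{p.name})" for p in pages]
    text = "\n".join(lines) + "\n"
    (Path(wiki_root) / f"{section}.md").write_text(text)
    return text
```

Because the listing is sorted, regenerating the catalog after a rescan is reproducible: the same pages always yield the same file.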

The richer ingest layer now makes the loop more useful before an LLM even runs:

  • ingest pdf preserves the source PDF and writes a sidecar Markdown file with extracted text and metadata
  • ingest url captures page metadata such as description, canonical URL, headings, discovered links, content stats, and local image captures
  • ingest repo captures repository stats, language signals, recent commits, and a nested tree snapshot in the repo manifest, whether the source is local or cloned from a remote Git URL
  • ingest urls reads a plain-text or JSON URL list into raw/urls/
  • ingest sitemap expands a sitemap into individual URL captures
  • ingest batch processes a JSON manifest so larger source sets can land in one deterministic pass, including URL lists and sitemaps

Batch ingest accepts a JSON list or an object with an items list:

{
  "items": [
    {"kind": "url", "source": "https://example.com/article"},
    {"kind": "urls", "source": "/path/to/urls.txt"},
    {"kind": "sitemap", "source": "/path/to/sitemap.xml"},
    {"kind": "pdf", "source": "/path/to/paper.pdf"},
    {"kind": "repo", "source": "https://github.com/example/repo.git"}
  ]
}
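Both accepted shapes normalize to the same item list before processing. A validation sketch, with the allowed kinds taken from the ingest commands above (`load_batch` itself is a hypothetical helper):

```python
import json

# Ingest kinds supported by the batch manifest, per the commands above.
ALLOWED_KINDS = {"url", "urls", "sitemap", "pdf", "repo"}

def load_batch(text: str) -> list[dict]:
    """Parse a batch manifest: either a JSON list or an object with an items list."""
    data = json.loads(text)
    items = data if isinstance(data, list) else data.get("items", [])
    for item in items:
        if item.get("kind") not in ALLOWED_KINDS:
            raise ValueError(f"unknown ingest kind: {item.get('kind')!r}")
        if "source" not in item:
            raise ValueError("each batch item needs a source")
    return items
```

Rejecting unknown kinds up front keeps a large batch deterministic: either the whole manifest is well-formed or nothing lands in raw/.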

The query and research outputs are now more citation-friendly by default:

  • reports render an evidence summary with inline source ids like [S1]
  • reports now render Fact Blocks that separate source-backed claims from the looser narrative sections
  • source blocks include path, source kind, score, retrieval reason, snippet, and embedded-image hints
  • compile packets include input-context excerpts so external agents see richer raw context up front
  • research runs validate inline citations and persist their status into .cognisync/runs/
  • scans now materialize stable source, graph, and review manifests at .cognisync/sources.json, .cognisync/graph.json, and .cognisync/review-queue.json
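An evidence summary with [S1]-style inline ids can be assembled by numbering the retrieved sources in order. A sketch using the source-block fields listed above (the field names and rendering are assumptions about shape, not the exact output format):

```python
def render_evidence(sources: list[dict]) -> str:
    """Render an evidence summary, labeling each source with a stable [S#] id."""
    lines = ["## Evidence", ""]
    for i, src in enumerate(sources, start=1):
        lines.append(
            f"[S{i}] {src['path']} ({src.get('kind', 'unknown')}, score={src.get('score', 0)})"
        )
        if snippet := src.get("snippet"):
            lines.append(f"    > {snippet}")
    return "\n".join(lines)
```

Because the ids are assigned in retrieval order, a report body can cite `[S1]` and a reader can resolve it against this block without leaving the Markdown file.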

Research Command

cognisync research is the opinionated operator surface for question-driven work:

cognisync research "how do agent loops use memory?" --profile claude --mode memo --slides

It scans the workspace, searches the corpus, renders a cited report, builds a prompt packet, optionally runs the packet through an adapter profile, validates inline citations, and files the resulting answer back into the workspace.

Every research run now also writes:

  • a research plan in .cognisync/plans/
  • a run manifest in .cognisync/runs/
  • a research job workspace in outputs/reports/research-jobs/
  • a research change summary in outputs/reports/change-summaries/
  • enough state to resume execution later without rebuilding the packet

Research now supports orchestration profiles too:

  • synthesis-report for working-set and outline-driven synthesis
  • literature-review for paper matrices and gap tracking
  • repo-analysis for code-surface and interface mapping
  • contradiction-finding for claim ledgers and disagreement handling
  • market-scan for competitor-grid and positioning work

Research verification is now stricter too:

  • unknown citations fail the run
  • uncited narrative claims fail the run
  • malformed answers, such as missing top-level headings, fail the run
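Those three failure modes can each be checked with simple pattern matching over the answer text. A minimal sketch of such validation (the heuristics, especially what counts as an uncited claim, are assumptions rather than the framework's exact rules):

```python
import re

def validate_answer(answer: str, known_ids: set[str]) -> list[str]:
    """Return verification failures: unknown citations, uncited claims, missing H1."""
    failures = []
    cited = set(re.findall(r"\[(S\d+)\]", answer))
    for cid in sorted(cited - known_ids):
        failures.append(f"unknown citation: {cid}")
    if not re.search(r"^# ", answer, flags=re.M):
        failures.append("malformed answer: missing top-level heading")
    # Treat any non-heading paragraph without an inline citation as an uncited claim.
    for para in answer.split("\n\n"):
        stripped = para.strip()
        if stripped and not stripped.startswith("#") and not re.search(r"\[S\d+\]", stripped):
            failures.append(f"uncited claim: {stripped[:40]!r}")
    return failures
```

An empty failure list means the run passes; any entry fails the run, matching the strict behavior described above.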