Markdowndb
Turn markdown files into structured, queryable data with JS. Build markdown-powered docs, blogs, and sites quickly and reliably.
MarkdownDB
MarkdownDB is a JavaScript library that turns markdown files into a structured, queryable database (SQL-based or simple JSON). It helps you build rich markdown-powered sites easily and reliably. Specifically it:
- Parses your markdown files to extract structured data (frontmatter, tags etc) and builds a queryable index either in JSON files or a local database (SQLite, MySQL, or PostgreSQL)
- Provides a lightweight JavaScript API for querying the index and using the data in your application
Database Support
MarkdownDB supports multiple database backends through Knex.js:
- SQLite (default) - Perfect for local development and small to medium sites. No additional setup required.
- MySQL - Great for larger sites and when you need a separate database server. Requires the mysql2 package.
- PostgreSQL - Enterprise-grade database with advanced features. Requires the pg package.
All databases provide the same API and features, so you can easily switch between them based on your needs.
Features and Roadmap
- [x] Index a folder of files - create a db index given a folder of markdown and other files
- [x] Command line tool for indexing: Create a markdowndb (index) on the command line v0.1
- [x] SQL(ite) index v0.2
- [x] MySQL and PostgreSQL support - Use MySQL or PostgreSQL as your database backend
- [x] JSON index v0.6
- [ ] BONUS Index multiple folders (with support for configuring e.g. prefixing in some way e.g. i have all my blog files in this separate folder over here)
- [x] Configuration for Including/Excluding Files in the folder
Extract structured data like:
- [x] Frontmatter metadata: Extract markdown frontmatter and add in a metadata field
- [ ] deal with casting types e.g. string, number so that we can query in useful ways e.g. find me all blog posts before date X
- [x] Tags: Extracts tags in markdown pages
- [x] Extract tags in frontmatter v0.1
- [x] Extract tags in body like #abc v0.5
- [x] Links: links between files like [hello](abc.md) or wikilink style [[xyz]] so we can compute backlinks or deadlinks etc (see #4) v0.2
- [x] Tasks: extract tasks like this: - [ ] this is a task (See obsidian data view) v0.4
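To make the extraction targets above concrete, here is a minimal sketch of pulling body tags and task items out of a markdown string with regular expressions. It is not MarkdownDB's implementation, which may handle more cases (for example, ignoring tags inside code blocks); the helper names extractBodyTags and extractTasks and the returned object shapes are hypothetical.

```javascript
// Hypothetical helpers for illustration only; not part of the mddb API.

// Collect inline tags such as "#abc" from a markdown body.
// A tag must follow whitespace or start the string and begin with a letter,
// so heading markers like "# Title" (hash followed by a space) are skipped.
function extractBodyTags(markdown) {
  const tags = new Set();
  const pattern = /(?:^|\s)#([A-Za-z][\w/-]*)/g;
  for (const match of markdown.matchAll(pattern)) {
    tags.add(match[1]);
  }
  return [...tags];
}

// Collect task list items such as "- [ ] todo" or "- [x] done".
function extractTasks(markdown) {
  const tasks = [];
  const pattern = /^\s*[-*] \[( |x|X)\] (.+)$/gm;
  for (const match of markdown.matchAll(pattern)) {
    tasks.push({ description: match[2].trim(), checked: match[1] !== " " });
  }
  return tasks;
}

const sample = [
  "# My notes",
  "Some text with #abc and #project/x tags.",
  "- [ ] this is a task",
  "- [x] this one is done",
].join("\n");

console.log(extractBodyTags(sample)); // tags found in the body
console.log(extractTasks(sample)); // one object per task item
```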
Data enhancement and validation
- [x] Computed fields: add new metadata properties based on existing metadata e.g. a slug field computed from title field; or, adding a title based on the first h1 heading in a doc; or, a type field based on the folder of the file (e.g. these are blog posts). cf https://www.contentlayer.dev/docs/reference/source-files/define-document-type#computedfields.
- [ ] 🚧 Data validation and Document Types: validate metadata against a schema/type so that I know the data in the database is "valid" #55
- [ ] BYOT (bring your own types): i want to create my own types ... so that when i get an object out it is cast to the right typescript type
Quick start
Have a folder of markdown content
For example, your blog posts. Each file can have a YAML frontmatter header with metadata like title, date, tags, etc.
---
title: My first blog post
date: 2021-01-01
tags: [a, b, c]
author: John Doe
---
# My first blog post
This is my first blog post.
I'm using MarkdownDB to manage my blog posts.
Index the files with MarkdownDB
Use the npm mddb package to index Markdown files into an SQLite database. This will create a markdown.db file in the current directory. You can preview it with any SQLite viewer, e.g. https://sqlitebrowser.org/.
# npx mddb <path-to-folder-with-your-md-files>
npx mddb ./blog
You can also index multiple directories at once:
# Index multiple directories into a single database
npx mddb ./blog ./docs ./notes
If you pass a file path, the CLI prints the parsed JSON to stdout:
npx mddb ./blog/post.md
Non-markdown extensions still parse, but the CLI warns: "Is this a markdown file? Expected .md, .markdown, or .mdx."
Watching for Changes
To monitor files for changes and update the database accordingly, simply add the --watch flag to the command:
npx mddb ./blog --watch
This command will continuously watch for any modifications in the specified folder (./blog), automatically rebuilding the database whenever a change is detected.
Execute a Script with MarkdownDB APIs
Run a JS module that can call the MarkdownDB APIs directly. Arguments after the script path are passed through.
Internally, --exec uses node --import to set up module resolution so mddb is available.
npx mddb --exec ./scripts/report.mjs --flag value
You can also pass a module over stdin:
cat ./scripts/report.mjs | npx mddb --exec -
Query your files with SQL...
E.g. get all the files with tag a.
SELECT files.*
FROM files
INNER JOIN file_tags ON files._id = file_tags.file
WHERE file_tags.tag = 'a'
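The join above can be read as: keep every row of files whose _id appears in file_tags with the given tag. The sketch below reproduces that logic over in-memory arrays standing in for the two tables. Only the column names _id, file, and tag come from the query; the sample rows, the file_path property, and the helper name filesWithTag are made up for illustration, and the sketch assumes each (file, tag) pair appears at most once.

```javascript
// In-memory stand-ins for the "files" and "file_tags" tables (sample data).
const files = [
  { _id: "f1", file_path: "blog/first.md" },
  { _id: "f2", file_path: "blog/second.md" },
  { _id: "f3", file_path: "docs/intro.md" },
];
const fileTags = [
  { file: "f1", tag: "a" },
  { file: "f1", tag: "b" },
  { file: "f3", tag: "a" },
];

// Mirrors:
//   SELECT files.* FROM files
//   INNER JOIN file_tags ON files._id = file_tags.file
//   WHERE file_tags.tag = 'a'
function filesWithTag(fileRows, tagRows, tag) {
  // Ids of files that carry the requested tag.
  const matchingIds = new Set(
    tagRows.filter((row) => row.tag === tag).map((row) => row.file)
  );
  return fileRows.filter((file) => matchingIds.has(file._id));
}

console.log(filesWithTag(files, fileTags, "a").map((f) => f.file_path));
// → [ 'blog/first.md', 'docs/intro.md' ]
```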
...or using MarkdownDB Node.js API in a framework of your choice!
Use our Node API to query your data for your blog, wiki, docs, digital garden, or anything you want!
Install mddb package in your project:
npm install mddb
Now, once the data is in the database, you can add the following script to your project (e.g. in /lib folder). It will allow you to establish a single connection to the database and use it across your app.
SQLite (default)
// @/lib/mddb.mjs
import { MarkdownDB } from "mddb";
const dbPath = "markdown.db";
const client = new MarkdownDB({
client: "sqlite3",
connection: {
filename: dbPath,
},
});
const clientPromise = client.init();
export default clientPromise;
MySQL
First, install the MySQL driver:
npm install mysql2
Then configure MarkdownDB to use MySQL:
// @/lib/mddb.mjs
import { MarkdownDB } from "mddb";
const client = new MarkdownDB({
client: "mysql2",
connection: {
host: "localhost",
port: 3306,
user: "your_username",
password: "your_password",
database: "your_database",
},
});
const clientPromise = client.init();
export default clientPromise;
PostgreSQL
First, install the PostgreSQL driver:
npm install pg
Then configure MarkdownDB to use PostgreSQL:
// @/lib/mddb.mjs
import { MarkdownDB } from "mddb";
const client = new MarkdownDB({
client: "pg",
connection: {
host: "localhost",
port: 5432,
user: "your_username",
password: "your_password",
database: "your_database",
},
});
const clientPromise = client.init();
export default clientPromise;
Now, you can import it across your project to query the database, e.g.:
import clientPromise from "@/lib/mddb";
const mddb = await clientPromise;
const blogs = await mddb.getFiles({
folder: "blog",
extensions: ["md", "mdx"],
});
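Because casting frontmatter values to types is still on the roadmap, date-based filtering can be done in application code on the returned list. The following is a minimal sketch, assuming each returned file object carries its frontmatter under a metadata property with a date value that new Date() can parse; the helper name postsBefore and the sample objects are hypothetical.

```javascript
// Hypothetical helper; assumes objects shaped like { url_path, metadata: { date } }.
function postsBefore(fileList, cutoff) {
  const cutoffTime = new Date(cutoff).getTime();
  return fileList
    .filter((file) => {
      const time = new Date(file.metadata?.date).getTime();
      // Files without a parseable date yield NaN and are skipped.
      return !Number.isNaN(time) && time < cutoffTime;
    })
    .sort(
      // Newest first.
      (a, b) => new Date(b.metadata.date) - new Date(a.metadata.date)
    );
}

// Sample data standing in for the result of mddb.getFiles(...).
const sampleFiles = [
  { url_path: "blog/first", metadata: { title: "My first blog post", date: "2021-01-01" } },
  { url_path: "blog/second", metadata: { title: "Second post", date: "2022-06-15" } },
  { url_path: "blog/undated", metadata: { title: "No date" } },
];

console.log(postsBefore(sampleFiles, "2022-01-01").map((f) => f.url_path));
// → [ 'blog/first' ]
```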
Process a single file or stream
Use processMarkdown when you want to parse one markdown source without indexing a folder (e.g. in a Worker):
import { processMarkdown } from "mddb";
const source = "# Hello";
const fileInfo = await processMarkdown(source, {
filePath: "posts/hello.md",
rootFolder: "posts",
pathToUrlResolver: (inputPath) => inputPath,
});
You can pass a Node.js Readable stream or an ArrayBuffer as input. If you omit folder context, backlinks and folder-wide link resolution are not available.
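For the ArrayBuffer case, the sketch below shows one way to turn a markdown string into an ArrayBuffer with the standard TextEncoder and to check that it decodes back to the same text. The helper name toArrayBuffer is hypothetical; the resulting buffer is the kind of value you would pass as input in place of the string, per the description above.

```javascript
// Hypothetical helper: encode a markdown string as UTF-8 in a standalone ArrayBuffer.
function toArrayBuffer(source) {
  const bytes = new TextEncoder().encode(source); // Uint8Array
  // Slice so the result covers exactly the encoded bytes, even if the
  // Uint8Array is a view onto a larger underlying buffer.
  return bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength);
}

const markdownBuffer = toArrayBuffer("# Hello\n\nCafé notes");
console.log(markdownBuffer instanceof ArrayBuffer); // → true
console.log(new TextDecoder().decode(markdownBuffer)); // decodes back to the original text
```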
Computed Fields
This feature helps you define functions that compute additional fields you want to include.
Step 1: Define the Computed Field Function
First, define a function that computes the additional field you want to include. In this example, we have a function named addTitle that extracts the title from the first heading in the AST (Abstract Syntax Tree) of a Markdown file.
const addTitle = (fileInfo, ast) => {
// Find the first header node in the AST
const headerNode = ast.children.find((node) => node.type === "heading");
// Extract the text content from the header node
const title = headerNode
? headerNode.children.map((child) => child.value).join("")
: "";
// Add the title property to the fileInfo
fileInfo.title = title;
};
Step 2: Indexing the Folder with Computed Fields
Now, use the client.indexFolder method to scan and index the folder containing your Markdown files. Pass the addTitle function in the computedFields option array to include the computed title in the database.
// indexFolder takes a single options object
client.indexFolder({
folderPath: "PATH_TO_FOLDER",
customConfig: { computedFields: [addTitle] },
});
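As a second illustration, the sketch below uses the same (fileInfo, ast) signature as addTitle to derive a slug from the title, one of the computed-field ideas listed in the roadmap. The function name addSlug, the fileInfo.slug property, and the slug rules are assumptions for this example, not something MarkdownDB prescribes.

```javascript
// Hypothetical computed field: derive a URL-friendly slug from fileInfo.title.
// Same (fileInfo, ast) signature as addTitle; the AST is not needed here.
const addSlug = (fileInfo, ast) => {
  const title = fileInfo.title ?? "";
  fileInfo.slug = title
    .toLowerCase()
    .normalize("NFKD") // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .replace(/[^a-z0-9]+/g, "-") // collapse every other run of characters into a hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
};

// Stand-alone check with a plain object in place of a real fileInfo.
const exampleInfo = { title: "My first blog post!" };
addSlug(exampleInfo, { children: [] });
console.log(exampleInfo.slug); // → my-first-blog-post
```

If the slug should build on the title set by addTitle, it would need to run after it, e.g. computedFields: [addTitle, addSlug], assuming computed fields are applied in array order.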
Configuring markdowndb.config.js
Use a markdowndb.config.js file to:
- Define computed fields that calculate values dynamically from existing data.
- Specify the patterns for including or excluding files in MarkdownDB.
Example Configuration
Here's an example markdowndb.config.js with custom configurations:
export default {
computedFields: [
(fileInfo, ast) => {
// Your custom logic here
},
],
include: ["docs/**/*.md"], // Include only files matching this pattern
exclude: ["drafts/**/*.md"], // Exclude those files matching this pattern
};
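To show how include and exclude patterns like these might combine, here is a simplified sketch. It is not MarkdownDB's matching code: the tiny glob matcher only understands "**" and "*", the helper names globToRegExp and shouldIndex are hypothetical, and two behaviours are assumptions made for the example: an empty include list means "include everything", and exclude wins over include.

```javascript
// Simplified glob matcher for illustration; supports only "**" and "*".
function globToRegExp(glob) {
  const pattern = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters (not * or /)
    .replace(/\*\*\//g, "\u0000") // placeholder for "**/" (zero or more folders)
    .replace(/\*\*/g, "\u0001") // placeholder for a bare "**"
    .replace(/\*/g, "[^/]*") // "*" stays within one path segment
    .replace(/\u0000/g, "(?:.*/)?")
    .replace(/\u0001/g, ".*");
  return new RegExp(`^${pattern}$`);
}

// Assumed semantics: a file is indexed if it matches some include pattern
// (or no include patterns are given) and matches no exclude pattern.
function shouldIndex(filePath, { include = [], exclude = [] } = {}) {
  const matchesAny = (globs) => globs.some((g) => globToRegExp(g).test(filePath));
  const included = include.length === 0 || matchesAny(include);
  return included && !matchesAny(exclude);
}

const config = {
  include: ["docs/**/*.md"],
  exclude: ["drafts/**/*.md"],
};

for (const p of ["docs/intro.md", "docs/guide/setup.md", "docs/image.png", "drafts/wip.md"]) {
  console.log(p, shouldIndex(p, config));
}
// docs/intro.md true
// docs/guide/setup.md true
// docs/image.png false
// drafts/wip.md false
```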
(Optional) Index your files in a prebuild script
{
"name": "my-mddb-app",
"scripts": {
...
"mddb": "mddb <path-to-your-content-folder>",
"prebuild": "npm run mddb"
},
...
}
With a Next.js project
For example, in your Next.js project's pages, you could do:
// @/pages/blog/index.js
import React from "react";
import clientPromise from "@/lib/mddb.mjs";
export default function Blog({ blogs }) {
return (
<div>
<h1>Blog</h1>
<ul>
{blogs.map((blog) => (
<li key={blog.id}>
<a href={blog.url_path}>{blog.title}</a>
</li>
))}
</ul>
</div>
);
}
