Substack2md
Create a local markdown version of your favorite Substack newsletters
Install / Use
/learn @snapsynapse/Substack2mdREADME
substack2md
Convert Substack posts to clean, Obsidian-friendly Markdown using your authenticated browser session.
Why This Exists
Substack doesn't let you bulk-export your reading list or subscriptions in a useful format. This tool:
- Uses your logged-in browser via Chrome DevTools Protocol (CDP)
- Preserves frontmatter metadata
- Converts images/embeds to links (Obsidian-friendly)
- Rewrites cross-references as wikilinks
[[YYYY-MM-DD-slug]] - Organizes by publication into folders
Features
- No password management - Uses your live browser session
- Batch processing - Single URLs or text files with multiple URLs
- Sequential with delays - Configurable sleep between requests to be polite
- Obsidian wikilinks - Auto-converts internal links to existing notes
- Configurable naming - Map publication slugs to custom directory names
- Transcript cleaning - Strips timestamps and speaker labels from podcast transcripts
Installation
# Clone the repo
git clone https://github.com/yourusername/substack2md.git
cd substack2md
# Install dependencies
pip install -r requirements.txt
Quick Start
1. Launch Your Browser with Remote Debugging
Brave (Recommended):
open -na "Brave Browser" --args \
--remote-debugging-port=9222 \
--remote-allow-origins=http://127.0.0.1:9222 \
--user-data-dir="$HOME/.brave-cdp-profile"
Chrome (Apple Silicon):
arch -arm64 /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
--remote-debugging-port=9222 \
--remote-allow-origins=http://127.0.0.1:9222 \
--user-data-dir="$HOME/.chrome-cdp-profile"
Chrome (Intel):
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
--remote-debugging-port=9222 \
--remote-allow-origins=http://127.0.0.1:9222 \
--user-data-dir="$HOME/.chrome-cdp-profile"
2. Log Into Substack
In the browser window that just opened, navigate to Substack and log in normally.
3. Convert Posts
Single URL:
python substack2md.py https://natesnewsletter.substack.com/p/latest-post
Multiple URLs from file:
python substack2md.py --urls-file my-reading-list.txt
Specify output directory:
python substack2md.py https://daveshap.substack.com/p/post-slug --base-dir ~/my-notes
Configuration
Environment Variables
# Set default base directory
export SUBSTACK2MD_BASE_DIR=~/Documents/substack-notes
# Set config file location
export SUBSTACK2MD_CONFIG=~/.config/substack2md/config.yaml
Config File
Create config.yaml in the script directory or specify with --config:
# Base directory for markdown output
base_dir: ~/Documents/substack-notes
# Map publication slugs to custom directory names
publication_mappings:
signalsandsubtractions: Signals_And_Subtractions
natesnewsletter: Nates_Notes
daveshap: David_Shapiro
See config.yaml.example for a template.
Usage Examples
# Single post with custom output directory
python substack2md.py https://pub.substack.com/p/slug --base-dir ~/vault
# Batch processing with slower delays (be nice to servers)
python substack2md.py --urls-file urls.txt --sleep-ms 500
# Save HTML alongside markdown (for debugging)
python substack2md.py URL --also-save-html
# Overwrite existing files
python substack2md.py URL --overwrite
# Process from existing markdown export (cleanup only)
python substack2md.py --from-md export.md --url https://pub.substack.com/p/slug
URL File Format
Create a text file with one URL per line:
https://signalsandsubtractions.substack.com/p/the-trust-gap
https://natesnewsletter.substack.com/p/i-surveyed-100-ai-tools-that-launched
# Comments start with #
https://daveshap.substack.com/p/the-merits-of-doing-things-the-hard
Output Structure
~/Documents/substack-notes/
├── Signals_And_Subtractions/
│ └── 2025-09-29-the-trust-gap.md
├── Nates_Notes/
│ ├── 2025-10-20-i-surveyed-100-ai-tools-that-launched.md
│ └── 2025-10-18-i-read-17-hours-of-ai-news-this-week.md
└── David_Shapiro/
└── 2025-10-18-the-merits-of-doing-things-the-hard.md
Markdown Frontmatter
Each file includes YAML frontmatter:
---
title: "Post Title"
subtitle: "Optional subtitle"
author: "David Shapiro"
publication: "daveshap"
published: "2025-10-18"
updated: "2025-10-18"
retrieved: "2025-10-20T15:30:00Z"
url: "https://daveshap.substack.com/p/post-slug"
canonical: "https://daveshap.substack.com/p/post-slug"
slug: "post-slug"
tags: [substack, ai, automation]
image: "https://substackcdn.com/image.jpg"
links_internal: 3
links_external: 12
source: "substack2md v1.1.0"
---
Content starts here...
Troubleshooting
"No CDP connection"
- Make sure your browser launched with
--remote-debugging-port=9222 - Check that no other process is using port 9222
- Try closing all Chrome/Brave windows and launching again
"Missing modules" error
pip install -r requirements.txt
URLs not being converted to wikilinks
- The tool only converts links to posts you've already downloaded
- Run a second pass to catch cross-references
Rate limiting / bot detection
- Increase
--sleep-ms(default: 150ms) - Use smaller batches
- Substack shouldn't rate-limit authenticated sessions, but YMMV
Advanced Options
python substack2md.py --help
options:
--urls-file FILE File with URLs, one per line
--from-md FILE Clean existing markdown export
--url URL URL for --from-md mode
--base-dir DIR Output directory
--config FILE Path to config.yaml
--also-save-html Save HTML sidecar files
--overwrite Replace existing files
--cdp-host HOST CDP hostname (default: 127.0.0.1)
--cdp-port PORT CDP port (default: 9222)
--timeout SECONDS Page load timeout (default: 45)
--retries N Retry failed URLs N times (default: 2)
--sleep-ms MS Delay between requests (default: 150)
Contributing
Pull requests welcome! Areas for improvement:
- Support for other platforms (Medium, Ghost, etc.)
- Better error handling
- Progress bars for batch processing
- Parallel processing option
- Export to other formats
License
MIT License - see LICENSE file for details.
Credits
Built with:
Disclaimer
This tool is for personal archival purposes. Respect content creators' rights and Substack's terms of service. DON'T STEAL! STEALING IS BAD BAD BAD!!! Getting better utility from Substacks you already support is not. Sharing without permission is the line, don't cross it.
